Abstract


This chapter offers an introduction to the book and is 
specifically recommended for all readers intending to do 
the practical exercises it contains. It also provides readers 
with all the information they require to make the most of 
the book's contents. In this chapter, we explain the aim, 
structure and intended audience for this book. We also 
give the readers a few tips and guidelines about how to 
make best use of it. This is followed by a description of 
the software and the data used to do the practical 
exercises. In the last section of this chapter, we offer a 
detailed explanation about how we conducted the review 
of the LUC datasets carried out for Chap. “Land Use
Cover Datasets: A Review” and Part IV of the book.


1 Introduction


This chapter sets out the aims of this book and explains the 
methods and approaches applied in its production. It also 
aspires to be a guide, offering readers instructions as to how 
best to use the book. We therefore strongly encourage all 
readers to read this chapter carefully, so as to gain a clearer 
understanding of all the different aspects analysed in this 
book. This chapter also provides essential information for 
those wishing to do the practical exercises in this book. 

We begin by presenting the aims of the book and we offer 
a few tips explaining how each group of users can make best 
use of this book according to their particular requirements. 
Then, we provide information about the software and the 
data required to carry out the practical exercises presented in 
Parts II and III (Sect. 5). In the last section, we offer a 
detailed explanation of the review of LUC datasets carried 
out in Chap. “Land Use Cover Datasets: A Review” and 
Part IV (Sect. 6). 

The book is the fruit of two research projects. The first
seeks to provide a clearer understanding of the uncertainties
associated with Land Use Cover maps and with the results of
Land Use Cover Change modelling exercises (INCERTIMAPS
Project: Suitability and uncertainty of land use and land cover
maps for the analysis and modelling of territorial dynamics).
The second promotes Open Access software for teaching spatial
science (PE117519: Herramientas para la Enseñanza de la
Geomática con programas de Código Abierto). Complete
information about these projects can be found in the
Acknowledgements section.


2 What is the Main Aim of This Book? 


The aim of this book is to provide an up-to-date overview of
the state of the art in Land Use Cover (LUC) datasets and
validation tools. The book summarizes the available information
and makes it accessible to any interested user, including some
of the latest developments in the field.

The book was conceived as a practical tool to inform
readers about currently available LUC datasets at global and 
supra-national scales and to help them understand more 
about the validation of LUC data and LUCC modelling 
exercises, so enabling them to validate their own data and 
models. To this end, the book combines brief theoretical 
explanations with practical information and exercises. 

Part I of the book briefly covers the theoretical foundations
of LUC mapping, LUCC modelling and the analysis
and assessment of their associated uncertainties. Parts II and 
III were conceived as practical guides to enable any reader to 
use any of the tools and data. Part II covers the visualization 
of LUC data and the production of reference datasets to 
validate LUC maps. Part III describes the use of common 
validation tools and the interpretation of their results. All the 
practical exercises are accompanied by an explanation of the 
basic theory behind them, so as to enable users to understand 
the analyses and the principles on which the techniques are 
based. Finally, Part IV of the book characterizes the most 
relevant available LUC data. It also provides all the necessary
information as to how to download and use the datasets.

As the book aims to reach the widest possible audience, 
the theory is briefly explained in simple, understandable 
terms. Practical exercises are implemented in QGIS, an 
open-source Geographical Information System, which can 
be downloaded for free. 


3 Who is the Book Aimed At? 


The book is aimed at anyone interested in Land Use Cover
(LUC) mapping, Land Use Cover Change (LUCC) modelling and
Land Use Cover Change analysis. It aims to be accessible and
useful to all kinds of users, regardless of their level of
expertise, although some background in the field is
recommended to make full use of it. In particular, a basic
knowledge of spatial analysis and GIS is required to
understand much of the information provided.

The book will be particularly useful for researchers working
in the fields of LUC mapping and LUCC modelling and especially
for those interested in validation methods and the available
sources of LUC data. Those interested in the application of
open-source software in LUC may also find this book very
useful, as it is the only book working with open-source software
that focuses on these topics from a holistic perspective. For the
QGIS community, the book provides the relevant information
and tools to enable users to take full advantage of the software
and expand the fields in which it can be effectively applied.


4 How to Use This Book?


The book can be used in different ways, depending on the 
type of user and their particular background and interests. 
With this in mind, it has been conceived as a flexible tool 
that can be used for a wide variety of purposes. 

Beginners in this field are referred to Chap. “Land Use
Cover Mapping, Modelling and Validation. A Background”,
as are other users interested in gaining an overall
picture of LUC mapping, LUCC modelling and the 
essential concepts required for uncertainty and validation 
analyses. This short, yet comprehensive chapter sets out 
the basic theoretical principles on which the rest of the 
book is based and is therefore recommended reading for 
all users. 

For LUC data visualization and creation, readers are 
referred to Part II of this book. It provides an overview of 
the different options available for symbolizing LUC data 
and LUC change in GIS. It also addresses some of the 
problems associated with the spatial visualization of LUC 
information. This part of the book also includes a tutorial 
on the creation of a set of sample points for LUC data 
validation with QGIS. 

Users interested in the validation of LUC datasets and 
Land Use Cover Change (LUCC) modelling exercises 
should refer to Chap. “Validation of Land Use Cover Maps:
A Guideline”. This provides guidelines for validating different
LUC products: single LUC maps, LUC map series,
and outputs from LUCC modelling exercises. The different 
tools and methods referred to in these guidelines are then 
described in detail and applied in practice in the example 
exercises in Part III of the book. 

Users interested in doing the example QGIS exercises 
appearing in this book should refer to Sect. 5 of this chapter, 
which presents all the data and the cases studied in this book. 
It also offers essential information about the particular version
of QGIS that we use and about how to integrate R
software into QGIS, a necessary step when carrying out 
some of the exercises set out in the book. 

Those interested in LUC data sources should refer to 
Chap. “Land Use Cover Datasets: A Review”, which offers
an introduction to LUC mapping at global and supra-national 
scales, including a review of the different datasets available. 
Part IV of the book offers in-depth descriptions of most of the 
datasets that are available for download, detailing their 
specific characteristics and how they can be accessed. The 
methodology followed in the review of the datasets is 
described in Sect. 6 of this chapter.


5 QGIS Exercises: Software, Study Areas and Data


5.1 GIS Software

Of all the Geographical Information Systems (GIS) currently
available, in this book we use QGIS, a well-known and widely
used open-source GIS. It provides a unified interface to many
other relevant open-source GIS programs, such as SAGA,
GDAL, GRASS or LasTools (Menke et al. 2016). It also
allows integration with R, a powerful open-source environment
for statistical analysis.

For the practical exercises included in this book, we opted
for QGIS 3.10.13 “A Coruña”, which was the newest long-term
release of QGIS available when we began writing the book.

Users could try other versions of QGIS when doing the
exercises included in this book. However, they should bear
in mind that the exercises have been created and tested using
the version indicated above and that certain issues and errors
may arise when using any other version of QGIS. Versions
prior to QGIS 3 are strongly discouraged, as important changes
were made in the software between versions 2 and 3 and many
features of QGIS 3 do not work in earlier versions of the
software.
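
The version guidance above can be expressed as a small check. The following Python sketch is only an illustration: the function names and the three labels are hypothetical and do not form part of QGIS. It assumes a version string of the form "3.10.13-A Coruña", which in the QGIS Python console is typically exposed as Qgis.QGIS_VERSION from the qgis.core module.

```python
import re

# Version used to create and test the exercises, as stated in the text.
TESTED_VERSION = (3, 10, 13)


def parse_qgis_version(version_string):
    """Return (major, minor, patch) from a string such as '3.10.13-A Coruña'."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"Unrecognised QGIS version string: {version_string!r}")
    return tuple(int(part) for part in match.groups())


def compatibility_note(version_string):
    """Classify a QGIS version against the one the exercises were tested with."""
    version = parse_qgis_version(version_string)
    if version[0] < 3:
        return "discouraged"  # versions prior to QGIS 3
    if version == TESTED_VERSION:
        return "tested"
    return "untested"  # QGIS 3.x, but not the version used in the book


print(compatibility_note("3.10.13-A Coruña"))  # tested
print(compatibility_note("2.18.28"))           # discouraged
```

A result of "untested" does not mean that the exercises will fail, only that differences in the interface or the results are possible.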

The latest version of QGIS is available at the QGIS
website (www.qgis.org). Users who require a specific version
of this software should visit https://qgis.org/downloads/.
Full documentation relating to the software can also be found
at the official website (https://www.qgis.org/en/docs/index.html),
where inexperienced QGIS users will find a brief introduction
to the software interface and the main tools.

Several user manuals are also available to help beginners 
make the most of the software. These include the books 
published by Packt (Graser et al. 2017; Cutts and Graser 
2018) and the series of manuals coordinated by Baghdadi 
et al. (2018a, b, c, d), which contain both generic and thematic
GIS exercises.


5.2 QGIS Plugins 


QGIS works with plugins written in the C++ and Python 
programming languages. These plugins are an easy way to 
expand the capabilities of the software, which is why many 
of the features of the software are currently implemented 
through these plugins. 

There are two types of plugins: core and external plugins 
(QGIS Project 2020). The core plugins are maintained by the 
QGIS Development Team and automatically form part of the 
distributed software. The external plugins are developed by a
community of users and are available at the QGIS Python
Plugins Repository (https://plugins.qgis.org/plugins/). 

The external plugins may be up-to-date or outdated and 
are usually available for specific QGIS versions. The official
plugin repository includes information about both of these
aspects. External plugins that are still in the early stages of
development and have not been widely used are marked by 
QGIS as experimental plugins and are not directly available 
through the software. 

Several QGIS plugins are used in the exercises presented 
in this book (Table 1). In all cases, we used the most 
up-to-date versions of these plugins as of when we began 
writing. Some of the plugins may have been updated since 
then, which could lead to certain differences in the interface 
and the results. This is something that readers should be 
aware of when using the plugins. 

The Semi-Automatic Classification Plugin is one of the
most important QGIS plugins and is used in many of the
exercises in this book. It was developed and updated by
Luca Congedo (2016) and provides a comprehensive interface
and set of tools for classifying remote sensing imagery.
This includes many tools for validating image classifications,
which are also used in this book. For more information
on the plugin and how to use it, users should refer to the
plugin manual (Congedo 2016) and official website
(see Table 1).

LecoS (Landscape ecology Statistics) is a plugin developed
by Jung (2016) to calculate the spatial metrics usually
employed in the field of landscape ecology. Although other 
methods can be implemented in QGIS to calculate these 
metrics, the LecoS plugin is the best-known QGIS tool for 
this purpose. All the relevant information about the plugin is 
available at the official website (see Table 1). 

The R Processing Provider allows the capabilities of the R
software to be integrated into QGIS. Full documentation on
the plugin is available at the official website (see Table 1).
Users can also find extra information on the plugin and the
way the R language can be integrated into QGIS in the
official QGIS documentation (see footnote 1). To find out more
about how to integrate R into QGIS, users should consult
Sect. 5.3 of this chapter.

QuickMapServices is a widely used QGIS plugin that allows
users to import many different kinds of web-map services
(XYZ tiles, TMS, WMS, WMTS, ESRI ArcGIS Services) into the
QGIS interface. More information on the plugin is available
at the official website (see Table 1) and in the manual
recommended by the plugin’s authors, which is in Russian
(see footnote 2).

Footnote 1: https://docs.qgis.org/3.4/en/docs/user_manual/appendices/qgis_r_syntax.html#syntax-summary-for-qgis-r-scripts
Footnote 2: https://gis-lab-info.translate.goog/qa/quickmapservices.html?_x_tr_sl=ru&_x_tr_tl=en&_x_tr_hl=en


Table 1 QGIS plugins employed in the practical exercises of the book

Plugin | URL
Processing R Provider | https://north-road.github.io/qgis-processing-r/
Semi-automatic Classification | https://fromgistors.blogspot.com/p/semi-automatic-classification-plugin.html
LecoS - Landscape Ecology Statistics | https://conservationecology.wordpress.com/qgis-plugins-and-scripts/lecos-land-cover-statistics/
MapAccurAssess | https://doi.org/10.5281/zenodo.5419130
QuickMapServices | https://nextgis.com/blog/quickmapservices/
Google Earth Engine Data Catalog | https://github.com/sandroklippel/qgis_gee_data_catalog/wiki


The Google Earth Engine Data Catalog plugin provides
direct access in QGIS to the data catalog that forms part of
the Google Earth Engine platform. Users will need a Google
account to make use of this plugin. Little information is
available about the plugin; users who need more details are
referred to its official website (see Table 1).

We also use MapAccurAssess, a plugin specifically
developed for the exercises of this book by Dominguez Vera
(2021). Although not available yet in the official QGIS
plugin repository, it can be downloaded from the official
repository of information accompanying this book (see
Table 1). The plugin provides a tool for assessing the
accuracy of classified Land Use Cover images, taking into
account the recommendations made by Olofsson et al.
(2013). For more information about the plugin, users are
referred to the plugin manual, in Spanish (Dominguez Vera,
2021). It is also available in the official repository for this
book.

To install any of these plugins in QGIS, access the
“Manage and install plugins...” tool in the plugins menu to
find the plugin you require. Once selected, click on the
“Install Plugin” option (Fig. 1). In the “Settings” tab of the
tool, users can also make experimental and deprecated plugins
available in QGIS. To install MapAccurAssess, use the
“Install from ZIP” tab, select the downloaded file and then
click “Install Plugin” (Fig. 2).


5.3 Integrating R into QGIS

Some of the exercises presented in this book use R, a free,
open-source statistical software. QGIS enables the R environment
to be integrated into the software, making it easier
for any QGIS user to take full advantage of the tools
available through R.

QGIS does not natively include all the validation tools and
methods reviewed in this book. We have therefore had to
implement some of them in QGIS through the R processing
environment. Users wishing
to find out more about R and its integration into QGIS, with 
practical exercises about how to use both software packages 
in combination, should consult the manual by Islam (2018). 

To integrate R into QGIS, users must begin by downloading
the R software. R and its associated packages can be
downloaded from the Comprehensive R Archive Network
(CRAN); users should select the mirror closest to their
location at https://cran.r-project.org/mirrors.html.

Once downloaded and installed, users must also install a 
series of packages in R to execute the different tools and 
methods included in the book (Table 2). This step cannot be 
carried out through the QGIS interface. Users must open R 
and manually install the different packages. To do this, select 
Packages > Install Package(s)... from the menu (Fig. 3). In 
the window that opens, select the mirror from which to 
download the packages (Fig. 4). Finally, select the package 
to be installed (Fig. 5). Installation of the package may take 
a little while to complete. Installation is complete when the 
R console allows the user to write new code (Fig. 6). 
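
As an alternative to the menus, the same packages can in principle be installed by typing a single install.packages() call at the R console. The Python sketch below only assembles that call for the packages listed in Table 2, and optionally runs it through Rscript if that executable is found. The helper names and the default mirror URL are assumptions made for illustration, and the availability of individual packages on CRAN may have changed since the book was written.

```python
import shutil
import subprocess

# Package names as listed in Table 2 of this chapter.
R_PACKAGES = ["intensity.analysis", "raster", "rgdal", "ROCR", "sabre", "sf", "sp"]


def build_install_command(packages, repos="https://cran.r-project.org"):
    """Build the R expression that installs the given packages from a CRAN mirror."""
    quoted = ", ".join(f'"{name}"' for name in packages)
    return f'install.packages(c({quoted}), repos = "{repos}")'


def install_with_rscript(packages):
    """Run the install command through Rscript if it is on the PATH.

    Returns False, without doing anything, when Rscript cannot be found.
    """
    rscript = shutil.which("Rscript")
    if rscript is None:
        return False
    subprocess.run([rscript, "-e", build_install_command(packages)], check=True)
    return True


# Print the command so that it can be pasted into the R console.
print(build_install_command(R_PACKAGES))
```

The printed line can be pasted directly into the R console; install_with_rscript() is not called here, so running the sketch does not modify the local R installation.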

Table 2 lists the packages required to do the different 
exercises appearing in this book. In the table, next to each 
package name, we offer a link to the website with all the 
information about the package: description, download link, 
reference manual, etc. 

After installing R and the required packages, we need to 
install the QGIS plugin that allows us to integrate the two 
software packages. This is the “Processing R provider”
plugin. Instructions to this end can be found in Sect. 5.2 of 
this chapter. After installing the plugin, users must download 
the scripts we have developed to integrate the R tools and 
capabilities into QGIS. These scripts are listed in Table 3 
and are available at https://doi.org/10.5281/zenodo.5418985 
in the official repository for this book. 

Fig. 1 QGIS plugins. Standard plugin installation workflow


Fig. 2 QGIS plugin. Plugin installation from a zip file


Table 2 List of R packages required to use the R scripts provided in this book

Package | URL
intensity.analysis | https://cran.r-project.org/web/packages/intensity.analysis/index.html
raster | https://cran.r-project.org/web/packages/raster/index.html
rgdal | https://cran.r-project.org/web/packages/rgdal/index.html
ROCR | https://cran.r-project.org/web/packages/ROCR/index.html
sabre | https://cran.r-project.org/web/packages/sabre/index.html
sf | https://cran.r-project.org/web/packages/sf/index.html
sp | https://cran.r-project.org/web/packages/sp/index.html


Fig. 3 Integrating R in QGIS. Installing the required packages in R: first step


Fig. 4 Integrating R in QGIS. Installing the required packages in R: second step (mirror selection)



Once downloaded, the script files must be pasted into the
R scripts folder of QGIS. The path to this folder can be
found in the “Options” menu of QGIS. To access it, go to
Settings > Options... and then select the “Processing”
submenu (Fig. 7). In the “Providers” tab, there is a specific
tab for “R”. After opening this tab, a list appears including
the “R scripts folder” path, which indicates where users must
save the scripts that come with the book.
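
The copy step described above can also be scripted. The following Python sketch copies every .rsx file from a download folder into the R scripts folder. Both folder paths are placeholders: the real destination is the “R scripts folder” path shown in the QGIS options, and the function name is hypothetical.

```python
import shutil
from pathlib import Path


def copy_r_scripts(download_dir, r_scripts_dir):
    """Copy every .rsx file from download_dir into r_scripts_dir.

    Returns the names of the copied files, sorted alphabetically.
    """
    source = Path(download_dir)
    target = Path(r_scripts_dir)
    target.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    copied = []
    for script in sorted(source.glob("*.rsx")):
        shutil.copy2(script, target / script.name)
        copied.append(script.name)
    return copied


# Example call with placeholder paths (replace them with the real folders):
# copy_r_scripts("path/to/downloaded/scripts", "path/to/qgis/r/scripts/folder")
```

Only files with the .rsx extension are copied, so any other files downloaded alongside the scripts are left where they are.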

Fig. 5 Integrating R in QGIS.


Fig. 7 Integrating R in QGIS. R configuration in QGIS


Table 3 List of the R scripts developed for use in this book

Script file | Function
LUCCBudget.rsx | LUCC budget (Sect. 2 in Chap. “Pontius Jr. Methods Based on a Cross-Tabulation Matrix to Validate Land Use Cover Maps”)
Intensity_analysis.rsx | Intensity analysis (Sect. 6 in Chap. “Pontius Jr. Methods Based on a Cross-Tabulation Matrix to Validate Land Use Cover Maps”)
Stable_change_flow_matrix.rsx | Flow matrix (Sect. 7 in Chap. “Pontius Jr. Methods Based on a Cross-Tabulation Matrix to Validate Land Use Cover Maps”)
Flow_matrix_graf.rsx | Flow matrix (Sect. 7 in Chap. “Pontius Jr. Methods Based on a Cross-Tabulation Matrix to Validate Land Use Cover Maps”)
Correlation.rsx | Correlation (Sect. 1 in Chap. “Validation of Soft Maps Produced by a Land Use Cover Change Model”)
ROCAnalysis.rsx | ROC analysis (Sect. 2 in Chap. “Validation of Soft Maps Produced by a Land Use Cover Change Model”)
MapCurves raster.rsx | Map curves (recommended for raster data) (Sect. 1 in Chap. “Advanced Pattern Analysis to Validate Land Use Cover Maps”)
MapCurves vector.rsx | Map curves (recommended for vector data) (Sect. 1 in Chap. “Advanced Pattern Analysis to Validate Land Use Cover Maps”)
Change Statistics.rsx | Change statistics (Sect. 1 in Chap. “Metrics Based on a Cross-Tabulation Matrix to Validate Land Use Cover Maps”)
Individual Areal Inconsistency.rsx | Areal and spatial agreement metrics (Sect. 2 in Chap. “Metrics Based on a Cross-Tabulation Matrix to Validate Land Use Cover Maps”)
Individual Spatial Agreement.rsx | Areal and spatial agreement metrics (Sect. 2 in Chap. “Metrics Based on a Cross-Tabulation Matrix to Validate Land Use Cover Maps”)
Overall Areal Inconsistency.rsx | Areal and spatial agreement metrics (Sect. 2 in Chap. “Metrics Based on a Cross-Tabulation Matrix to Validate Land Use Cover Maps”)
Overall Spatial Agreement.rsx | Areal and spatial agreement metrics (Sect. 2 in Chap. “Metrics Based on a Cross-Tabulation Matrix to Validate Land Use Cover Maps”)
Overall Spatial Inconsistency.rsx | Areal and spatial agreement metrics (Sect. 2 in Chap. “Metrics Based on a Cross-Tabulation Matrix to Validate Land Use Cover Maps”)
Local accuracy assessment statistics.rsx | Overall, user and producer's accuracies through GWR (Sect. 1 in Chap. “Geographically Weighted Methods to Validate Land Use Cover Maps”)


Fig. 8 Location map of the Asturias Central Area


5.4 Study Areas 


The exercises provided in this book are applied to three
specific study areas: the Ariège Valley (France), the Asturias
Central Area (Spain) and the Marqués de Comillas municipality
(Mexico). We now offer a brief introduction to these
study areas, so as to give readers the contextual information
they require for a clearer understanding of the results of the
exercises.


5.4.1 The Asturias Central Area (Spain) 

The Asturias Central Area is a rural-industrial-urban area 
located in the heart of Asturias, in Northern Spain (Fig. 8). It 
hosts around 80% of the Asturian population and most of its 
economic activity (Rodríguez Gutiérrez et al. 2009). It is 
made up of a polycentric set of cities of different sizes that
play a complementary socioeconomic role. The cities are
surrounded by a network of villages and plenty of rural 
space, where a traditional rural economy and lifestyle is 
mixed with peri-urban dynamics (Rodríguez Gutiérrez et al. 
2013). 




The cities at the top of the urban hierarchy are Oviedo,
Gijón and Avilés, which have concentrated most of the urban
LUC dynamics in recent decades (Gobierno del Principado de
Asturias 2016). The area within the triangle formed by the
three cities has also been the subject of important LUC
dynamics, with the emergence of new industrial and residential
developments, attracted by the accessibility that the
area’s extensive transport network provides (Méndez Garcia
and Ortega Montequin 2013). The south of the Asturias 
Central Area is dominated by small industrial cities, mainly 
Mieres and Langreo, located in long, narrow valleys where 
there is almost no new space for development (Prada Trigo 
2011). These were formerly mining/industrial towns which 
are now in decline. 


5.4.2 Ariège Valley (France)

The Ariège Valley area consists of the central part of the
valley formed by the River Ariège, which is situated in the
department of the same name about 70 km south of Toulouse
(Fig. 9). It covers an area of 1113 km² and has a
population of about 80,000 inhabitants. The Ariège Valley is
a rural area with agriculture in the northern part and wooded
land in the south, approaching the Pyrenees. The largest
town is Pamiers, in the centre of the valley, with about
15,000 inhabitants, while the departmental capital, Foix, has
a population of 9700. Saverdun, in the north of the valley,
has 4900 inhabitants.


Fig. 9 Map showing the location of the Ariège Valley

In the past, the Ariège Valley was a centre for industrial
and mining activities while today it is mainly rural. Tourism 
is increasingly common. The most notable LUC dynamics 
are reforestation and the increase in built-up areas, which are 
mainly concentrated along the river. 


5.4.3 Marqués de Comillas (Mexico) 

Marqués de Comillas is a physiographical region of the
Lacandon rainforest in Chiapas, Mexico (Fig. 10). Bounded
by two rivers, the Usumacinta and the Lacantún, it
comprises approximately 15% (2032 km²) of the Lacandon
region. The climate is hot and humid, with an average annual
temperature of 24.3 °C and average annual precipitation of
2960 mm, most of which falls from May to December
(García-Amaro 2004).

A colonization programme by the Mexican Government 
in the 1970s encouraged the establishment of farming 
communities in forest-covered areas, promoting agriculture, 
agroforestry (cacao) and cattle ranching, which is currently 
the most important business activity. Over the last 40 years, 
Marqués de Comillas has suffered a dramatic loss in forest 
cover; in the mid-1980s, forests occupied 83% of the region, 
while today, this has fallen to just 29%, less than half of 
which are well-preserved forests. The landscapes are now
made up mainly of mosaics of agricultural lands, cattle
pastures and human settlements.


Fig. 10 Location of Marqués de Comillas


5.5 Data 


All the data used in the example exercises provided in this
book can be found online and downloaded at
https://doi.org/10.5281/zenodo.5418318 in the official
repository for this book. The data consist of LUC maps for the
three different study areas (Ariège Valley, Asturias Central
Area and the Marqués de Comillas municipality) and the data
from LUCC modelling exercises for the first two. The data for
the Ariège Valley come from the work carried out by Nabila
Bounoua and Jéromine Le Campion, students of the Master in
Geomatics SIGMA at the University of Toulouse Jean Jaurès.

Detailed information on the LUCC modelling exercises
developed for the Asturias Central Area and Ariège Valley
can be found in studies by García-Álvarez et al. (2019) and
Bounoua and Le Campion (2019). The LUC maps for these
two areas were obtained from two different datasets: CORINE
Land Cover and SIOSE. The LUC map for the Marqués de
Comillas municipality was obtained through the classification
of satellite imagery.

We will now briefly describe the LUC datasets and maps 
that form part of the database for each study area. At the end of 
this section, there is a table with all the files used in this book. 

CORINE Land Cover (CLC) is a pan-European dataset of
LUC information available for five different dates from 1990
to 2018. It provides detailed, coherent LUC information for
most of the countries in Europe. It is usually produced by
photointerpretation in vector format at a scale of 1:100,000,
with a Minimum Mapping Unit (MMU) of 5-25 ha and a
Minimum Mapping Width (MMW) of 100 m. Detailed
information about this dataset can be found in Chap.
“General Land Use Cover datasets for Europe” of this book.

A simpler version of CLC is used in the Ariège Valley
(Fig. 11) and the Asturias Central Area (Fig. 12) case
studies. In the latter, CLC is available in both vector and
raster format. Although CLC is officially distributed in raster
format at a spatial resolution of 100 m, the CLC rasters for
the study areas in this book are provided at a different spatial
resolution: 50 m for Asturias and 15 m for Ariège. These
rasters were obtained after rasterizing the CLC vector layers.
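
The idea behind rasterizing a vector layer at a chosen cell size can be sketched as follows. This is only an illustration under our own assumptions: it uses a simple cell-centre rule (each cell takes the code of the polygon that contains its centre), toy polygons and invented category codes. It is not the procedure used to produce the rasters distributed with this book, which would normally be done with GIS software.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, ring: Sequence[Point]) -> bool:
    """Ray-casting test: is (x, y) inside the polygon defined by ring?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygons: Sequence[Tuple[int, Sequence[Point]]],
              xmin: float, ymax: float,
              ncols: int, nrows: int,
              cell_size: float, nodata: int = 0) -> List[List[int]]:
    """Assign to each cell the code of the first polygon containing its centre."""
    grid = [[nodata] * ncols for _ in range(nrows)]
    for row in range(nrows):
        yc = ymax - (row + 0.5) * cell_size      # rows run from north to south
        for col in range(ncols):
            xc = xmin + (col + 0.5) * cell_size
            for code, ring in polygons:
                if point_in_polygon(xc, yc, ring):
                    grid[row][col] = code
                    break
    return grid


# Two hypothetical 100 m x 200 m polygons with invented category codes 1 and 2.
toy_polygons = [
    (1, [(0, 0), (100, 0), (100, 200), (0, 200)]),
    (2, [(100, 0), (200, 0), (200, 200), (100, 200)]),
]
raster_50m = rasterize(toy_polygons, xmin=0, ymax=200, ncols=4, nrows=4, cell_size=50)
raster_100m = rasterize(toy_polygons, xmin=0, ymax=200, ncols=2, nrows=2, cell_size=100)
```

Running the same vector layer through the function with different cell sizes yields rasters of different dimensions that describe the same area, which is the situation described above for the Asturias and Ariège rasters.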

SIOSE (Sistema de Información sobre Ocupación del
Suelo de España) is a Spanish dataset in vector format that
provides very detailed LUC information. It was obtained by
photointerpretation of aerial imagery at a scale of 1:25,000, with an
MMU of 0.5-2 ha and an MMW of 10 m. It follows an
object-oriented data model: instead of being assigned to a
single LUC category, each polygon is described by a code
detailing all the land uses and covers it contains.

Some of the maps in the Asturias Central Area case study
were obtained by simplifying the SIOSE database.
Each SIOSE polygon was first classified into a single
category, and the resulting vector dataset was then rasterized
at 50 m (Fig. 12). More
information on how this operation was performed can be
found in García-Álvarez (2018). Extra information about the
characteristics of SIOSE can be found in Valcárcel et al.
(2008) and García-Álvarez and Camacho Olmedo (2017).
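
To illustrate what classifying a polygon with a mixed composition into a single category might involve, here is a minimal sketch based on a majority rule. Both the rule and the category labels are assumptions made for illustration only; the actual criteria applied to the SIOSE polygons are those documented in García-Álvarez (2018), and real SIOSE codes are considerably richer than the dictionaries used here.

```python
from typing import Dict


def dominant_category(composition: Dict[str, float]) -> str:
    """Return the cover with the largest share of the polygon.

    composition maps a (hypothetical) cover label to its percentage of the
    polygon area. Ties are broken alphabetically so the result is repeatable.
    """
    if not composition:
        raise ValueError("the polygon has no recorded composition")
    return max(sorted(composition), key=lambda label: composition[label])


# A hypothetical polygon described by the covers it contains.
mixed_polygon = {"urban_fabric": 60.0, "grassland": 30.0, "roads": 10.0}
tied_polygon = {"forest": 50.0, "crops": 50.0}

print(dominant_category(mixed_polygon))
print(dominant_category(tied_polygon))
```

A majority rule is only one of several possible choices; other rules (for example, giving priority to artificial covers) would produce different simplified maps from the same database.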

The Marqués de Comillas LUC map (Fig. 13) is part of a
database on Land Cover and Land Cover/Land Use Changes
in the State of Chiapas in Mexico. The original database
covers 7.5 million ha, of which the Marqués de Comillas
map covers a small section of approximately 200,000 ha.
The maps were produced via a supervised classification
of 2019 Sentinel-2 imagery. They were subsequently
photo-interpreted to correct errors from the supervised stage
as well as to include information on agricultural land uses.
The map contains eight thematic categories describing levels
of forest conservation and other land uses. The approximate
scale is 1:40,000, with an MMU of 1 ha. More
information can be found at the following link:
https://bosqueschiapasdemo.ecosur.ourecosystem.com/.

Fig. 11 Land Use Cover map
Fig. 13 Land Use Cover Map


In the following tables, we list the files from the different
datasets and LUC modelling exercises described above that
have been used in different exercises in this book. More
datasets are available online, including extra LUC maps and
model drivers not considered in the exercises in this book.

The tables include information about the name of the file
available for download and the descriptive name used to
refer to these files in the book. For each dataset, we also
provide the projection of the dataset and the file describing
the legend of the maps. A document listing all these
characteristics for the layers only available online is also
provided when downloading the data.


Ariège Valley (Val d'Ariège)


Projection: WGS84/UTM 31N (EPSG: 32631)
Associated files: BD_Val_Ariege (Word document file):
explanation and legend


File name                     Name in the book

CORINE Land Cover
CLC_2000                      CORINE Land Cover Map Val d'Ariège 2000
CLC_2012                      CORINE Land Cover Map Val d'Ariège 2012
CLC_2018                      CORINE Land Cover Map Val d'Ariège 2018

Model Drivers
Roads_dist                    Distance to roads

Simulation output
CLC_predict_2018              Simulation LCM Val d'Ariège 2018
CLC_predict_2018_soft_UTM     Soft prediction LCM Val d'Ariège 2018
00_12_18_transition_2_to_1    Transition potential map from agricultural to artificial areas
00_12_18_transition_3_to_1    Transition potential map from forests to artificial areas
Markov18_class1_utm           Markovian probability map for artificial areas Ariège Valley


Asturias Central Area


Projection: WGS84/UTM 30N (EPSG: 32630)
Associated files: Legend Asturias maps (spreadsheet)


File name                                  Name in the book

CORINE Land Cover
C05.shp                                    CORINE Land Use Vector Map Asturias Central Area 2005
C05.rst                                    CORINE Land Use Map Asturias Central Area 2005
C05_v0.rst                                 CORINE Land Use Map Asturias Central Area 2005 v.0
C11.shp                                    CORINE Land Use Vector Map Asturias Central Area 2011
C11.rst                                    CORINE Land Use Map Asturias Central Area 2011
Changes_CORINE_05_11.rst                   CORINE Land Use Changes Asturias Central Area 2005-2011
C18.rst                                    CORINE Land Use Map Asturias Central Area 2018

SIOSE
S11.shp                                    SIOSE Land Use Vector Map Asturias Central Area 2011
S11.rst                                    SIOSE Land Use Map Asturias Central Area 2011

CORINE Land Use Cover Change model and simulation
Simulation_C11.rst                         Simulation CORINE Asturias Central Area 2011
Simulation_C11_SIOSEDemands.rst            Simulation CORINE 2 Asturias Central Area 2011
Simulated_changes_CORINE_05_11.rst         Simulated CORINE changes Asturias Central Area 2005-2011
C_Suitability_factor_Urban_Fabric.rst      Urban fabric suitability map - CORINE model

Marqués de Comillas 


Projection: WGS84/UTM 15N (EPSG: 32615) 
Associated files: Marques LUC datasets (Word document 
file): dataset description and legend 


File name                     Name in the book

LandCoverMarques2019.tif      Marqués de Comillas Land Use Land Cover Map 2019
RandomSample_Buffer.shp       Photo-interpreted reference dataset - Marqués de Comillas 2019
RandomSample.shp              Centroids of sample sites - Marqués de Comillas
random_sample_points.shp      Marqués de Comillas random sample points from Mexico (2019)
limit.shp                     Boundaries of Marqués de Comillas study area
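
The three EPSG codes listed above follow a regular pattern: for WGS84/UTM projections in the northern hemisphere, the code is 32600 plus the UTM zone number (32700 plus the zone number in the southern hemisphere). The short sketch below makes that relationship explicit. The function name and the dictionary of study areas are our own illustrative choices and are not part of the downloadable data.

```python
def wgs84_utm_epsg(zone: int, northern: bool = True) -> int:
    """Return the EPSG code of the WGS84/UTM projection for a given zone."""
    if not 1 <= zone <= 60:
        raise ValueError("UTM zones are numbered from 1 to 60")
    base = 32600 if northern else 32700
    return base + zone


# UTM zones of the three study areas, as given in the projection lines above.
study_area_zones = {
    "Ariège Valley": 31,
    "Asturias Central Area": 30,
    "Marqués de Comillas": 15,
}

study_area_epsg = {area: wgs84_utm_epsg(zone) for area, zone in study_area_zones.items()}
for area, code in study_area_epsg.items():
    print(f"{area}: EPSG:{code}")
```

Checking the code of a layer against this pattern is a quick way to confirm that it has been loaded with the projection expected for its study area.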


6 Review of Land Use Cover Datasets 


Chapter “Land Use Cover Datasets: A Review” and Part IV 
of the book contain a review of the Land Use Cover datasets 
available at global and supra-national scales. Due to the 
limited extent and scope of this book, we did not review 
national and regional LUC datasets, which are far too 
numerous for our purposes. 

The datasets we reviewed are classified into two groups,
depending on the information they provide. The first group
is made up of the datasets that provide information about the
different land uses or covers without focusing on any one of
them in particular, i.e. general LUC datasets. The second
group consists of the LUC datasets that map a specific land
use or cover in detail (e.g. vegetation, croplands, built-up
areas...). These are referred to as thematic LUC datasets.
Some datasets are difficult to assign to one of the two
groups, as they map a wide range of LUC categories while
also providing specific detail on just one of them. The
authors decided which group to assign them to on a
case-by-case basis.

The datasets were also classified according to their extent,
differentiating between global and supra-national LUC datasets.
The first group of datasets maps land uses or covers all
over the Earth, while the second maps them for a specific area
covering more than one country. The maps in the second group
may cover a whole continent or focus on just a few countries.

When making the review, we consulted the most relevant
web portals and repositories of LUC data (Table 4). A few
selected papers, reports and other relevant documents
reviewing or comparing LUC datasets were also consulted
(Manakos and Braun 2014; Mora et al. 2014; Grekousis
et al. 2015; Tsendbazar et al. 2015; Diogo and Koomen
2016; Klotz et al. 2016; Pérez-Hoyos et al. 2017; Fritz et al.
2019).


Table 4 List of repositories and web portals distributing LUC
information at global and supra-national scales

Web portal                            Description

Copernicus Land Monitoring Service    Web portal for the thematic land monitoring service provided by
                                      the Copernicus programme. It offers information on land
                                      monitoring at global, pan-European and local scales for the
                                      European Union

FAO GeoNetwork                        Catalog of spatial datasets developed and maintained by the
                                      Food and Agriculture Organization (FAO). It has a specific
                                      section on LUC information

Geo-Wiki                              Platform developed to collect LUC information via
                                      crowdsourcing. Its viewer hosts different LUC datasets, either
                                      from external projects or developed through the Geo-Wiki
                                      project and similar crowdsourcing approaches

Google Earth Engine                   Catalogue of spatial datasets to be used as part of Google Earth
                                      Engine. There is a specific section on LUC datasets

Land Processes Distributed Active     Archive of spatial data managed in partnership by the USGS and
Archive Center (LP DAAC)              NASA, which distributes most of the information produced by
                                      these institutions

Wekeo                                 Copernicus Data and Information Access Service (DIAS), which
                                      provides a cloud-based platform to access and process
                                      Copernicus data. It includes a catalogue of data with many LUC
                                      products produced within the context of the Copernicus
                                      programme

FROM-GLC                              Web portal developed and maintained by experts from Tsinghua
                                      University, which distributes all the LUC datasets produced by
                                      the team associated with this university since the FROM-GLC
                                      project


Very old or outdated maps, which were produced
according to traditional cartographic methods, are not
included in this review. Nor are other old maps that combine
LUC information with other data about climate or biogeographic
variables, such as the maps produced by Matthews
(1983) and Olson et al. (1983). Traditional maps obtained
through photointerpretation of aerial imagery and field survey,
which offer information about certain specific land
covers such as vegetation and agricultural areas, are not
included in the review either. Although they may be interesting
sources for historical LUC change analysis, they are
usually only available for national or more detailed areas and
normally have not been digitalized.

There are plenty of other spatial datasets that provide
important information for studying specific land covers. For
vegetation covers, maps of live biomass are a good example
(Kindermann et al. 2008; Thurner et al. 2014). These datasets
were not included in our review because they are not
specific sources of LUC information focusing exclusively on
land cover. However, there is an enormous amount of data
like this that may be useful for the study and characterization
of LUC. This data comes in many different forms and from a
range of different sources.

Part IV of the book characterizes in detail all the reviewed 
LUC datasets that are currently available for download and 
may be relevant for a wide community of users. Datasets 
produced at very coarse scales or which are already very 
outdated are not described in Part IV, as they are of limited 
utility for most members of the LUC community. LUC 
datasets currently unavailable for download are not charac- 
terized in Part IV either. We tried to obtain, either online or 
by contacting the authors, all the global or supra-national 
datasets to which we found references. Some of them, 
however, are no longer available. These datasets have not 
been reviewed. 

The LUC datasets described in Part IV were characterized 
according to the following elements: information about the 
project or context within which they were produced; infor- 
mation about their method of production; description of the 
data available for download; and practical information for 
using the dataset in an effective way. For each dataset we 
also provide all the technical references in which it is 
described as well as other references of interest in which it is 
used or analysed. A table summarizing the main character- 
istics of the dataset (extent, temporal availability, spatial 
resolution, updates, accuracy...) is also provided. 
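
The classification criteria used in this review (general versus thematic, global versus supra-national, and available for download or not) can be thought of as a small data structure. The sketch below is purely illustrative: the class and field names are our own, the second and third entries are hypothetical, and the only real dataset listed is CORINE Land Cover, which is described earlier in this chapter as a pan-European dataset.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List, Tuple


class Theme(Enum):
    GENERAL = "general"      # maps many land uses or covers
    THEMATIC = "thematic"    # maps one land use or cover in detail


class Extent(Enum):
    GLOBAL = "global"
    SUPRA_NATIONAL = "supra-national"


@dataclass(frozen=True)
class LucDataset:
    name: str
    theme: Theme
    extent: Extent
    downloadable: bool       # only downloadable datasets are characterized


def group_datasets(datasets: Iterable[LucDataset]) -> Dict[Tuple[Theme, Extent], List[str]]:
    """Group the downloadable datasets by theme and extent."""
    groups: Dict[Tuple[Theme, Extent], List[str]] = defaultdict(list)
    for dataset in datasets:
        if dataset.downloadable:
            groups[(dataset.theme, dataset.extent)].append(dataset.name)
    return dict(groups)


catalogue = [
    LucDataset("CORINE Land Cover", Theme.GENERAL, Extent.SUPRA_NATIONAL, True),
    LucDataset("Hypothetical global cropland map", Theme.THEMATIC, Extent.GLOBAL, True),
    LucDataset("Hypothetical withdrawn map", Theme.GENERAL, Extent.GLOBAL, False),
]
groups = group_datasets(catalogue)
```

In practice, as noted above, some datasets do not fit neatly into one theme, so a structure like this one simplifies decisions that the authors made on a case-by-case basis.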




References 


Baghdadi N, Mallet C, Zribi M (2018a) QGIS and generic tools. Wiley, 
Hoboken 

Baghdadi N, Mallet C, Zribi M (2018b) QGIS and applications in 
territorial planning. Wiley, Hoboken 

Baghdadi N, Mallet C, Zribi M (2018c) QGIS and applications in water 
and risks. Wiley, Hoboken 

Baghdadi N, Mallet C, Zribi M (2018d) QGIS and applications in 
agriculture and forest. Wiley, Hoboken 

Bounoua N, Le Campion J (2019) L'évolution de l'occupation des sols 
et notamment de l'urbanisation sur la Vallée de l'Ariége. Rapport de 
Master II Géomatique SIGMA 

Cutts A, Graser A (2018) Learn QGIS: your step-by-step guide to the 
fundamental of QGIS 3.4, 4th ed. Packt, Birmingham 

Diogo V, Koomen E (2016) Land cover and land use indicators: review
of available data. OECD Green Growth Papers.
https://www.oecd-ilibrary.org/environment/land-cover-and-land-use-indicators_5jlr2z86r5xw-en.
Accessed 26 June 2021

Dominguez Vera RdJ (2021) Documentación para el plugin
MapAccurAssess. https://doi.org/10.5281/zenodo.5419130. Accessed 26
June 2021

Fritz S, See L, Bayas JCL et al (2019) A comparison of global 
agricultural monitoring systems and current gaps. Agric Syst 
168:258-272. https://doi.org/10.1016/j.agsy.2018.05.010 

García-Álvarez D, Camacho Olmedo MT (2017) Changes in the
methodology used in the production of the Spanish CORINE:
uncertainty analysis of the new maps. Int J Appl Earth Obs Geoinf
63:55-67. https://doi.org/10.1016/j.jag.2017.07.001

García-Álvarez D, Camacho Olmedo MT, Paegelow M (2019) 
Sensitivity of a common land use cover change (LUCC) model to 
the minimum mapping unit (MMU) and minimum mapping width 
(MMW) of input maps. Comput Environ Urban Syst 78.
https://doi.org/10.1016/j.compenvurbsys.2019.101389

García-Álvarez D (2018) The influence of scale in LULC modelling.
A comparison between two different LULC maps (SIOSE and 
CORINE). In: Camacho Olmedo MT, Paegelow M, Mas J-F, 
Escobar F (eds) Geomatic approaches for modeling land change 
scenarios. Springer, Cham, Switzerland, pp 187-213 

García-Amaro E (2004) 1928-1999: modificaciones al sistema de
clasificación climática de Köppen. Edit Instituto de Geografía,
UNAM, México 

Gobierno del Principado de Asturias (2016) Directrices subregionales 
de ordenación del Área Central de Asturias. Avance: objetivos y 
criterios 

Graser A, Mearns B, Mandel A, et al (2017) QGIS: becoming a GIS 
Power User. Learning path. Packt, Birmingham, Mumbai 

Grekousis G, Mountrakis G, Kavouras M (2015) An overview of 21
global and 43 regional land-cover mapping products. Int J Remote
Sens 36:5309-5335. https://doi.org/10.1080/01431161.2015.1093195

Islam S (2018) Hands-on geospatial analysis with R and QGIS. Packt, 
Birmingham 


Jung M (2016) LecoS—a python plugin for automated landscape 
ecology analysis. Ecol Inform 31:18-21.
https://doi.org/10.1016/j.ecoinf.2015.11.006

Kindermann GE, McCallum I, Fritz S, Obersteiner M (2008) A global 
forest growing stock, biomass and carbon map based on FAO 
statistics. Silva Fenn 42:387-396. https://doi.org/10.14214/sf.244




Klotz M, Kemper T, Geiß C et al (2016) How good is the map? A
multi-scale cross-comparison framework for global settlement 
layers: evidence from Central Europe. Remote Sens Environ 
178:191-212. https://doi.org/10.1016/j.rse.2016.03.001 

Congedo L (2016) Semi-automatic classification Plugin documentation
release 4.8.0.1.
https://buildmedia.readthedocs.org/media/pdf/semiautomaticclassificationmanual-v4/latest/semiautomaticclassificationmanual-v4.pdf.
Accessed 26 June 2021

Manakos I, Braun M (2014) Land use and land cover mapping in 
Europe. In: Practices and trends. Springer, Dordrecht, Heidelberg, 
New York, London 

Matthews E (1983) Global vegetation and land use: new 
high-resolution data bases for climate studies. J Clim Appl Meteorol 
22:474-487. https://doi.org/10.1175/1520-0450(1983)0222.0.CO;2 

Méndez García B, Ortega Montequín M (2013) Ciudad difusa y
territorio: el caso del Área Central Asturiana. Ciudad Rev Del Inst
Urbanística La Univ Valladolid 16:131-144

Menke K, Smith R, Pirelli L, Van Hoesen J (2016) Mastering QGIS. 
Packt, Birmingham 

Mora B, Tsendbazar N-E, Herold M, Arino O (2014) Global land cover 
mapping: current status and future trends. In: Manakos I, Braun M 
(eds) Land use and land cover mapping in Europe. Practices and 
trends. Springer, Dordrecht, Heidelberg, New York, London, 
pp 11-30 

Olofsson P, Foody GM, Stehman SV, Woodcock CE (2013) Making 
better use of accuracy data in land change studies: estimating 
accuracy and area and quantifying uncertainty using stratified 
estimation. Remote Sens Environ 129:122-131.
https://doi.org/10.1016/j.rse.2012.10.031

Olson JS, Watts JA, Allison LJ (1983) Carbon in live vegetation of 
major world ecosystems. https://cdiac.ess-dive.lbl.gov/epubs/ndp/ 
ndp017/ndp017appA.pdf. Accessed 26 June 2021 

Pérez-Hoyos A, Rembold F, Kerdiles H, Gallego J (2017) Comparison 
of global land cover datasets for cropland monitoring. Remote Sens 
9:1118. https://doi.org/10.3390/rs9111118 

Prada Trigo J (2011) Desarrollo, patrimonio y políticas de revital- 
ización en ciudades intermedias de especialización 
minero-industrial. In: El caso de Langreo (Asturias). Consejo 
Económico y Social del Principado de Asturias, Oviedo 

QGIS Project (2020) QGIS user guide. Release 3.4

Rodríguez Gutiérrez F, Menéndez Fernández R, Fernández Prieto JA 
(2013) Las villas en el sistema territorial asturiano. Eria 90:31-54 

Rodríguez Gutiérrez F, Menéndez Fernández R, Blanco Fernández J 
(2009) El área metropolitana de Asturias. In: Ciudad Astur: el 
nacimiento de una estrella urbana en Europa. Trea, Oviedo 

Thurner M, Beer C, Santoro M et al (2014) Carbon stock and density of 
northern boreal and temperate forests. Glob Ecol Biogeogr
23:297-310. https://doi.org/10.1111/geb.12125

Tsendbazar NE, de Bruin S, Herold M (2015) Assessing global land 
cover reference datasets for different user communities. ISPRS J 
Photogramm Remote Sens 103:93-114.
https://doi.org/10.1016/j.isprsjprs.2014.02.008

Valcárcel N, Villa G, Arozarena A, et al (2008) SIOSE, a successful
test bench towards harmonization and integration of land cover/use 
information as environmental reference data. In: Chen J, Jiang J, 
Peled A (eds) The international archives of the photogrammetry, 
remote sensing and spatial information sciences, vol XXXVII, part 
B8. ISPRS, Beijing 




Open Access This chapter is licensed under the terms of the Creative
Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as 
long as you give appropriate credit to the original author(s) and the 
source, provide a link to the Creative Commons license and indicate if 
changes were made. 




The images or other third party material in this chapter are included in 
the chapter's Creative Commons license, unless indicated otherwise in 
a credit line to the material. If material is not included in the chapter's 
Creative Commons license and your intended use is not permitted by 
statutory regulation or exceeds the permitted use, you will need to 
obtain permission directly from the copyright holder. 


Part I
Concepts, Data and Validation



David García-Álvarez, María Teresa Camacho Olmedo,
Jean-François Mas, and Martin Paegelow


Abstract 


In this chapter, we offer a brief introduction to the main 
concepts associated with Land Use Cover (LUC) map- 
ping, Land Use Cover Change (LUCC) modelling and the 
uncertainty and validation of LUC and LUCC data and 
model outputs. The chapter summarizes the theoretical 
fundamentals required to understand the rest of the book. 
First, we define Land Use and Land Cover concepts that 
have been extensively discussed and debated in the 
literature (Sect. 2). Second, we review the history of LUC 
mapping, from the first manually produced maps to the 
advent of aerial and satellite imagery and the production 
of new datasets with much greater detail and accuracy 
(Sect. 3). Third, we address the usefulness of LUC data 
and LUCC analysis for society (Sect. 4), contextualizing 
all these studies and efforts within the framework of Land 
Change Science (Sect. 5). Fourth, we offer a brief 
introduction to LUCC modelling, its purpose, uses and 
the different stages that make up a LUCC modelling 
exercise (Sect. 6). We also offer a brief introduction to the 
different types of LUCC models currently available. 
Finally, we present the concepts of uncertainty and
validation and offer a brief introduction to the topic
(Sect. 7). The chapter also includes a short list of
recommendations for further reading for those who wish
to explore the theory presented here in more depth.

Keywords
Land Use · Land Cover · Land Use Cover Change ·
Land Use Cover mapping · Land Change Science · Land
Use Cover Change modelling · Uncertainty · Validation


1 Introduction 


Land Use and Land Cover (LUC) data is an important source 
of information for a wide range of users from different 
backgrounds and scientific disciplines. It provides an over- 
view of the different covers on the Earth's surface (e.g. 
vegetation, agricultural fields, rocks, water, artificial sur- 
faces...) and how they evolve over time. It also traces how 
these covers are used (land use) and how this use changes. 

LUC data can be very useful in an array of different 
fields. It is especially valuable for understanding the impact 
that many natural and human-induced processes, such as 
climate change, deforestation and urbanization, can have on 
the Earth's surface. As a result, LUC research has been 
receiving increasing attention over recent decades, and the 
number of fields making use of this data is on the rise. 

Researchers have been proposing new methods and 
techniques for producing LUC maps. This has increased the 
number of LUC datasets available at global, continental, 
regional and local scales. This has also led to an increase in 
the number of users who decide to make their own LUC 
maps. The validation of LUC data has also been the subject 
of specific research and new methods, strategies and tech- 
niques have been proposed for validating and analysing 
LUC maps. 

Despite all these advances, many users are still unaware
of the wide range of datasets available, while others lack a
clear understanding of the methods or techniques that can be
used to validate LUC data. Thus, in addition to producing
more LUC datasets, more information is required. Users
must be able to find out more about the most appropriate
datasets for their field of study, and the general uncertainties
and limitations of each one. They should also be informed
about the methods that can be used to assess the specific
utility and uncertainties of this data for their line of research.


2 Land Use versus Land Cover 


Although Land Use and Land Cover are often combined, for 
example, in references to LUC maps and information, they 
in fact have quite separate meanings. Many authors have 
proposed complementary definitions (Di Gregorio and Jan- 
sen 1998; Campbell and Wynne 2011; Giri 2016a; Wulder 
et al. 2018) and the European directive INSPIRE, which 
establishes an Infrastructure for Spatial Information in the 
European Community, also includes a definition of each 
term (see text box below). On the basis of these various 
sources, we have opted for the following definitions. 


Directive INSPIRE (2007/2/EC) 


Land Cover: Physical and biological cover of the 
earth's surface including artificial surfaces, agricultural 
areas, forests, (semi-)natural areas, wetlands, water 
bodies. 


Land Use: Territory characterised according to its 
current and future planned functional dimension or 
socio-economic purpose (e.g. residential, industrial, 
commercial, agricultural, forestry, recreational). 


Land cover refers to the Earth's biophysical covers. Areas 
without a specific cover, such as areas of bare rock or bare 
soil, are also regarded as land covers. By contrast, land use 
refers to the activities that humans carry out on the Earth's 
surface or on a specific land cover. 

A land cover can have one or multiple uses, or even none. 
An artificial surface could be used to host people (e.g. res- 
idential area), production (e.g. industrial area) or leisure 
activities (e.g. sports facilities). In maps at coarser scales, 
this artificial surface can host all these uses together. For 
example, an urban area is an artificial cover which has 
multiple uses. Bare rock, on the other hand, often hosts no 
land use of any kind. 

A specific land use can also be associated with multiple
land covers at the same time. An airport is a land use that is
usually associated with several artificial covers, such as
buildings, roads and runways, and also with vegetation
covers, like grassland.

Whereas land covers are usually visible in aerial or
satellite images, land uses are more difficult to distinguish.
For instance, a building could have multiple uses:
apartments, offices, industrial plants, sports facilities, etc.
Sometimes the land use can be deduced from contextual
information in the image, but, in most cases, additional
information is required. This makes map production more
difficult and expensive. As a result, most maps only provide
information about land covers. In other cases, they focus on
the land use of certain specific covers, such as artificial or
agricultural areas, so providing both Land Use and Land
Cover (LUC) data. This is why in LUC science, we generally
talk about Land Use and Land Cover information, as the
two aspects tend to be combined within the same datasets.
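
The many-to-many relationship between land uses and land covers described in this section can be sketched with a simple lookup table. The labels below are taken from the examples in the text (the airport, the artificial surface and bare rock), but the structure and the function name are our own illustration and do not correspond to any official legend.

```python
from typing import Dict, List, Set

# Each land use is associated with one or more land covers.
LAND_USE_TO_COVERS: Dict[str, Set[str]] = {
    "airport": {"buildings", "roads", "runways", "grassland"},
    "residential": {"artificial surface"},
    "industrial": {"artificial surface"},
    "leisure": {"artificial surface"},
}


def uses_of_cover(cover: str) -> List[str]:
    """Return, in alphabetical order, the land uses associated with a cover."""
    return sorted(use for use, covers in LAND_USE_TO_COVERS.items() if cover in covers)


# One cover with several uses, one use with several covers, one cover with no use.
print(uses_of_cover("artificial surface"))
print(sorted(LAND_USE_TO_COVERS["airport"]))
print(uses_of_cover("bare rock"))
```

Reading the table in both directions shows why a single map category can rarely capture land use and land cover at the same time.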


3 Land Use and Land Cover Mapping: 
A History 


Some information on Land Use and Land Cover was 
available prior to the advent of remote sensing instruments 
(Campbell 1983). However, it was the appearance of aerial 
and, above all, satellite images that promoted the production 
of systematic LUC maps at regional, continental and global 
scales (Loveland 2016). 

Before the emergence of aeroplanes and satellites, the 
main method for map production was ground survey (Wallis 
1981; Fuller et al. 1994; Crone 2000). This was a 
time-consuming, laborious process that made systematic 
mapping of vast territories a difficult task. However, various 
important projects to map national territories were carried 
out in the eighteenth and nineteenth centuries without the 
use of aerial imagery (Collier 2009a). Most of these projects
involved topographic or cadastral maps, like the first French 
topographic survey finished in 1793, the French Napoleonic 
cadastre which began in 1807 or the Austrian cadastral 
survey launched in 1762 (Collier 2009a; Rochel et al. 2017). 
There are also striking examples of systematic exercises to 
map LUC information, such as the Land Utilization Survey 
of Great Britain, conducted from 1931 to 1938 (Campbell 
1983). Nonetheless, the general rule was for land use 
information to be presented as part of other maps with more 
general purposes (e.g. topographic, cadastral maps) or a very 
thematic approach (e.g. agricultural uses and production) 
(Campbell 1983). 

With the advent of aerial imagery and, later, satellite
imagery, mappers obtained a view of the Earth's surface
from the air or from space. Mapping
became easier and cheaper (Fuller et al. 1994). Instead of
going out to the field to collect information, mappers could
photointerpret and extract most of the features on the Earth's
surface from the imagery, including land uses and covers.
Information collected in the field was still required to validate
what was photointerpreted and to include some extra
information that was not discernible in the image (Steiner
1965; Campbell 1983). However, these tasks were less
time-consuming and demanding than the original ground
survey activities.

Aerial images became increasingly common from the 
beginning of the twentieth century, with the development of 
the aeroplane industry within the context of the two World 
Wars (Collier 2009b). Most nations started or boosted 
ambitious national mapping programmes for strategic or 
economic purposes. Many national topographic or cadastral 
mapping projects were completed during this period (Collier
2009a). Some pioneer land use mapping projects were also
launched at that time, such as the Michigan Land Economic 
Survey in the early 1920s and the Rural Land Classification 
Survey conducted by the Tennessee Valley Authority, which 
began in the 1930s (Steiner 1965). There was even a plan to 
create the first global land use map, with the foundation of a 
World Land Use Commission in 1949 and the mapping of 
different test areas in the 1950s and 1960s (Campbell 1983). 
However, mapping was still costly and very 
time-consuming. Although much easier than before, pho- 
tointerpretation was a manual task carried out using rudi- 
mentary tools that required a great deal of time and effort 
(Steiner 1965; Campbell 1983). 

The launch of the first satellite into space in 1957 proved 
a turning point in the history of LUC mapping (Emery and 
Camps 2017). Satellites provide a periodic imagery coverage 
of the Earth's surface. Once satellites started to provide 
images of the Earth, a homogeneous, cheap mosaic of the 
entire surface of the Earth soon became available (Morain 
1998; Chuvieco 2016). 

Satellites record the reflectance of the Earth's surface in 
different regions of the electromagnetic spectrum. The 
reflectance curve for each land cover can be independently 
characterized and defined (Chuvieco 2016; Emery and 
Camps 2017). In this way, satellite imagery gives mappers 
the information they need to draw the land covers on the 
Earth's surface automatically, so reducing the need for 
photointerpretation or human intervention in the process 
(Campbell and Wynne 2011; Chuvieco 2016). Nonetheless,
mapping land covers from image reflectance raises
several important issues that can result in uncertainty and
errors. One land cover can present several different spectral
responses due to variations in vegetation density and phe- 
nology. Different land covers can also present a similar 
spectral response. This problem, known as spectral confu- 
sion, is critical in diverse and complex landscapes and can 
lead to large numbers of classification errors. 
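
The principle of assigning a land cover from reflectance, and the way spectral confusion arises, can be illustrated with a very simple classifier that assigns each pixel to the class with the nearest mean signature. This is a minimal sketch only: the two-band reflectance values are invented for illustration, and minimum-distance classification is just one of many methods, not the one used to produce any of the maps discussed in this book.

```python
import math
from typing import Dict, Tuple

Signature = Tuple[float, float]  # (red, near-infrared) reflectance, values invented

# Hypothetical mean signatures for three land covers.
CLASS_MEANS: Dict[str, Signature] = {
    "water": (0.05, 0.02),
    "forest": (0.04, 0.45),
    "grassland": (0.06, 0.40),
}


def classify(pixel: Signature) -> str:
    """Assign the pixel to the class whose mean signature is closest."""
    return min(CLASS_MEANS, key=lambda name: math.dist(pixel, CLASS_MEANS[name]))


dense_forest_pixel = (0.04, 0.46)
sparse_forest_pixel = (0.05, 0.41)   # a forest pixel with lower vegetation density

print(classify(dense_forest_pixel))
print(classify(sparse_forest_pixel))
```

In this toy example, the sparse forest pixel lies closer to the grassland signature than to the forest one, so it is labelled as grassland. This is the kind of error that spectral confusion produces when one cover shows several spectral responses and different covers have similar ones.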

Despite these limitations, the availability of satellite
imagery and the ease with which land cover information
could be obtained from them boosted the production of land
cover maps, which until then had been relatively rare
(Comber 2008). Whereas most of the LUC information
available in the pre-satellite era had been focused above all
on land use, from then onwards, maps focusing on land
cover or on a mixture of land cover and land use became
predominant (Fisher and Unwin 2005; Comber 2008).

Manual photointerpretation was still common in the early 
years of satellite remote sensing (Campbell 1983). It benefited
from computer-assisted procedures, such as on-screen
digitization. However, it was progressively replaced by
digital procedures with the development of powerful com- 
puters and the improvement of classification and image 
treatment methods (Loveland 2016). Nonetheless, even 
today manual photointerpretation still plays an important 
role in the production of LUC maps. Recent examples of 
Land Use Cover mapping over large areas using visual 
interpretation include maps of Europe (CORINE Land 
Cover; see Feranec et al. (2007)), Africa (AFRICOVER; see 
Di Gregorio and Latham (2003); Fritz et al. (2015)) and 
China (Zhang et al. 2014). 

As LUC mapping became easier, cheaper and quicker, 
many institutions, scientists and other users began producing 
LUC datasets at all the different scales (Grekousis et al. 
2015; Loveland 2016). Initial efforts were mainly focused on 
regional and national scales (Loveland 2016). However, the 
appearance of the first satellites with sensors providing free 
imagery covering the whole Earth at coarse resolutions 
allowed the first global LUC datasets to be developed 
(Congalton et al. 2014; Mora et al. 2014; Grekousis et al. 
2015). 

The AVHRR sensor on board the NOAA weather satel- 
lites launched in 1978 (Campbell and Wynne 2011), and the 
VEGETATION sensor, installed in the SPOT satellite in 
1998 (Gutman et al. 2012a), provided the first sources of 
satellite imagery for global mapping exercises (Congalton 
et al. 2014; Gong et al. 2016). Landsat, which was first 
launched in 1972, provided the first source of satellite ima- 
gery at medium spatial resolutions, which could be used for 
LUC mapping at regional and local scales (Belward and 
Skøien 2015). 

Since then, LUC mapping practice has been developed in 
parallel with the launch of new satellites and the increasing 
improvement in their spatial and spectral resolutions (Bel- 
ward and Skøien 2015). This process has also been spurred 
by the appearance and consolidation of public and private 
initiatives focusing on Earth Observation and LUC moni- 
toring (Herold et al. 2016; Wulder et al. 2018). Although 
many such organizations now exist, perhaps the most 
important are the United States Geological Survey (USGS) 
and the European Space Agency (ESA). 

The key role played by the USGS is undeniable. It 
authored the first research laying down the foundations of 
modern LUC mapping (Anderson et al. 1976; Gutman et al. 
2008) and is also responsible for some of the most important 
Earth-monitoring projects today (Barber 2019; Szantoi et al. 
2020). The ESA has also played an important role, espe- 
cially recently after the launch of the Copernicus programme 
with the support of the European Commission (Szantoi et al. 
2020). The constellation of Sentinel satellites and the 
Copernicus land monitoring products, produced by the 
European Environmental Agency (EEA) and the Joint 
Research Centre (JRC), have enabled important advances in 
the production of detailed, high-quality LUC information 
that is updated periodically (Manakos and Braun 2014; 
Grekousis et al. 2015; Herold et al. 2016). 

Users now have more information available than ever 
(Belward and Skøien 2015; Grekousis et al. 2015; Giri 
2016a). Many LUC products have been developed and are 
ready to use, with abundant, detailed documentation about 
their characteristics (Grekousis et al. 2015; Diogo and 
Koomen 2016). There are numerous sources of satellite 
imagery, some of which are pre-treated and are available free 
of charge (Belward and Skøien 2015). Many methods have 
been developed for image processing and LUC mapping, 
such as classification algorithms (Bruzzone and Demir 2014; 
Yu et al. 2014; Khatami et al. 2016). Many methods and 
techniques have also been proposed for assessing the validity 
of LUC information (Strahler et al. 2006; Stehman and 
Foody 2019). Most of these methods and techniques are 
available on widely used software and are readily accessible 
to any user (Bastin et al. 2013; Mas et al. 2014b; Brovelli 
et al. 2018). All this has encouraged research into the pro- 
duction of LUC information and has widely extended its use, 
which has also led to an increase in published research on 
the topic, especially in the last 25 years (Yu et al. 2014). 


4 Uses of LUC Data 


The importance and utility of Land Use and Land Cover 
information is beyond doubt. LUC data is a valuable source 
of information for scientists (Bontemps et al. 2012; Manakos 
and Braun 2014). It gives them a better understanding of the 
interactions between societies and the environment (Lu et al. 
2004), an aspect of special interest for many social sciences 
such as geography or economics (Geoghegan 1998; Green 
et al. 2005). LUC data can also be used to monitor a range of 
natural and environmental processes (e.g. hydrological or 
meteorological), a matter of great interest for 
many natural sciences (Rindfuss et al. 2004). 

Policymakers also need LUC data for proper resource 
management and to help them deal with many of the chal- 
lenges facing society today (Szantoi et al. 2020). It allows 
them to understand where land resources are located and 
how and when they change (Strand 2013; Thackway et al. 
2013). 


Campbell (1983) reviewed some of the applications of 
LUC data in policymaking in the USA at different scales. He 
found that “almost all governmental units have a continuing 
requirement to create and implement laws and policies that 
directly or indirectly involve existing or future land use”. 
Local administrations need land use information for spatial 
planning. Regional and national governments may require 
LUC information for water management, flood control or in 
the design and assessment of environmental policies. At the 
international level, LUC data provides important evidence 
on which to base decisions regarding many of the global 
challenges facing society today. 

Most of the current global agendas refer to policy 
objectives involving Land Use and Land Cover. They play a 
direct role in 7 out of 17 UN Sustainable Development Goals 
(SDGs), and in the UN Framework Convention on Climate 
Change (UNFCCC), the Convention on Biological Diver- 
sity, the UN Convention to Combat Desertification 
(UNCCD) and the Ramsar Convention on Wetlands (Szantoi 
et al. 2020). LUC data is required to monitor many of the 
targets or actions proposed in these agreements, so empha- 
sizing the need for global LUC maps (Diogo and Koomen 
2016). 

The Group on Earth Observations (GEO) has defined 
eight Societal Benefit Areas (SBAs) in which Earth 
observations, including LUC data, provide useful evidence in 
support of policymaking. They are biodiversity and ecosystem 
sustainability, disaster resilience, energy and mineral 
resource management, food security and sustainable agri- 
culture, infrastructure and transportation management, pub- 
lic health surveillance, sustainable urban development and 
water resources management. Specifically, LUC data can 
help, among other things, to characterize the land for disease 
control; monitor fires; assess the potential of land for biofuel 
production and wind or hydropower generation; and assess 
the role of LUC changes in the dynamics of hydrological 
systems and vegetation (Giri 2016b). 

Among scientists, LUC maps are frequently used as a 
basis for modelling exercises (Tsendbazar et al. 2015; Her- 
old et al. 2016). At a global scale, climate change models 
require global LUC maps (Sophie et al. 2011). At regional 
and local scales, land use and cover change models have 
emerged as valuable tools for policy support (Van Delden 
et al. 2011; White et al. 2015). These models are built on 
LUC datasets (Sohl and Sleeter 2012). 

LUC information is also used for many other research 
activities, most of them related to the different policy fields 
mentioned above. In recent years, it has been applied, for 
example, in studies analysing habitat distribution and 
ecosystem services (Jacob et al. 2003; Brown 2013), spatial 
patterns of biodiversity (Zimmermann et al. 2010; Tuanmu 
and Jetz 2014), and ecosystem status and biogeochemical 
cycling (Johnson and Patil 1998; Lawrence et al. 2012). 
A wide variety of processes are also studied using LUC data. 
Bielecka (2019) reviews some of the most common processes 
analysed through the CORINE Land Cover database. These 
include agricultural abandonment, urbanization, afforestation, 
deforestation and landscape fragmentation. 


5 Land Change Science 


Although LUC information is employed for manifold pur- 
poses, the field taking most advantage of this data is Land 
Use and Land Cover Change (LUCC) analysis (Feranec 
et al. 2007; Verburg et al. 2009; Bielecka 2019). LUCC 
analysis is the study of the changes in the land uses and 
covers on the Earth's surface, and their causes and conse- 
quences (Moran et al. 2012). LUCC is not usually studied as 
an end in itself, and the focus is normally on understanding 
its impact on a range of other natural or human-induced 
processes (Gutman et al. 2012a). Many of them have already 
been mentioned when explaining the general utility of LUC 
data. 

LUC change analyses are widely used in climate change 
studies (Sophie et al. 2011), the study of hydrological sys- 
tems (Carlson and Traci Arthur 2000; Cuo et al. 2009), 
weather conditions (Marshall et al. 2004), soil erosion 
(Cebecauer and Hofierka 2008), loss of biodiversity (Cebe- 
cauer and Hofierka 2008), as well as in research into 
ecosystem services (Hu et al. 2008) or animal habitats 
(Lawler et al. 2004). The utility of LUC data increases when 
historical information is available, as it allows us to track 
LUC changes over time (Verburg et al. 2011; García- 
Álvarez and Camacho Olmedo 2017). 

The importance of LUCC studies has led to the emer- 
gence of a specialist field called Land Change Science 
(Gutman et al. 2012a; Turner 2017), which is also referred to 
as Land Use Science or Land System Science (Müller and 
Munroe 2014). This is defined as a “transdisciplinary field” 
that “seeks to understand the dynamics of land cover and 
land use as a coupled human-environment system to address 
theory, concepts, models, and applications relevant to 
environmental and societal problems, including the 
intersection of the two” (Turner et al. 2007). One of its hallmarks 
is the integration of natural and social sciences via a holistic 
approach (Rindfuss et al. 2004; Gutman et al. 2012a). Land 
Change Science now has its own specialists, who work at the 
confluence between these fields of knowledge (Moran et al. 
2012; Müller and Munroe 2014). 

Land Change scientists are responsible for monitoring 
LUC change, understanding it and modelling for the future, 
so obtaining knowledge and evidence that may be useful for 
policymaking (Turner et al. 2007). Land Change is part of 
the wider field of research addressing Global Environmental 
Change, for which historical series of LUC data are required 
(Turner et al. 2007; Janetos 2012). This is why Land Change 
Science has emerged in parallel to the growth in remote 
sensing observation and the appearance of the first time 
series of Earth observation data (Moran et al. 2012; Turner 
2017). 

Many international programmes and organizations have 
stressed the importance of LUCC and Land Change Science 
(Giri 2016b). Turner (2017) claims that the science first 
originated in the joint programme on LUCC funded by the 
International Geosphere Biosphere Program (IGBP) and the 
International Human Dimensions Programme (IHDP). Other 
programmes that have emphasized the importance of LUCC 
studies include the U.S. Climate Change Science Program, 
the Global Land Project, the Group on Earth Observations 
(GEO) and the United States Global Change Research 
Program (USGCRP) (Gutman et al. 2012b; Moran et al. 
2012). Some of these programmes are specifically focused 
on LUCC as a specialist interest, lying at the heart of their 
activities. These include the Land Cover and Land Use 
Change (LCLUC) programme run by NASA and the Global 
Observation of Forest and Land Cover Dynamics 
(GOFC-GOLD) programme (Gutman et al. 2012b). 


6 Land Use and Land Cover Change 
Modelling 


As previously noted, Land Change Science is not only a 
question of analysing and understanding LUC changes; 
it also seeks to model how they may evolve in the future 
(Gutman et al. 2012a; Turner 2017). Once we have understood what has 
changed, where it has changed, why it has changed (drivers 
or causes), how it has changed and what the consequences 
are, we can then take a step further and try to understand 
how different change trends can affect human-natural 
ecosystems. This is especially useful for policymaking. By 
evaluating different change scenarios, we can understand 
what the future may look like and what we can do to put the 
policy objectives we are seeking into practice (Oxley et al. 
2002; Soares-Filho et al. 2006; Escobar et al. 2018). 

Land Use and Land Cover Change Modelling (LUCCM) 
is about understanding the LUC dynamics at work within a 
given Earth system and modelling their future evolution 
(Verburg et al. 2004; Paegelow and Camacho Olmedo 
2008). To understand these dynamics, we need to study how 
the system has changed in the past and analyse the processes 
that gave rise to these changes (Plata Rocha 2010; Toro 
Balbotin 2014). By studying these processes in detail, we 
can identify the drivers behind the changes taking place 
(Bürgi et al. 2005; Kolb et al. 2013). Once we know what 
changes are occurring and why, we can conceptualize this 
information and translate it into modelling terms. 

Models allow us to play around with the system we are 
studying so as to predict how different policies affect LUC 
and the changes they may cause (Van Delden et al. 2011). 
Models also help us understand how these changes may 
evolve in the future under different socio-economic condi- 
tions (Antoni et al. 2018). At a more modest level, LUCC 
models also enable us to study and analyse these systems in 
detail, so as to obtain a more in-depth understanding of them 
(Hewitt et al. 2014). 

LUC maps are the main input for LUCC models (Sohl 
and Sleeter 2012; Grinblat et al. 2016), forming the base on 
which all processes are conceptualized (García-Álvarez et al. 
2019b). LUC maps conceptualize the landscape to be 
modelled: they present the LUC categories into which the 
landscape is divided and determine the spatial detail of the 
model (Conway 2009; García-Álvarez et al. 2019a). They 
are also often used as a reference for studying LUC changes 
in the past (Burnicki et al. 2010) and for validating LUCC 
models (Van Vliet et al. 2016). 

Many types of LUCC models are available today 
(National Research Council 2014). Although there is no 
standard, globally accepted classification, we can broadly 
distinguish between process and pattern-based LUCC mod- 
els (Brown et al. 2013). The latter assume that changes in the 
landscape pattern are the result of the processes and 
dynamics taking place, and that each pattern is a conse- 
quence of a specific process (Mas et al. 2014a). These 
models simulate the pattern and its changes. They are 
therefore heavily reliant on time series of LUC maps and the 
changes they show. 

Process-based models simulate the processes taking 
place, rather than the pattern (O'Sullivan and Perry 2013). 
There are different kinds of process-based models, with 
agent-based LUCC models gaining increasing popularity. 
These models simulate the behaviour of the agents or actors 
that take part in the system being modelled and their inter- 
actions (Crooks and Heppenstall 2012). These agents cause 
the processes taking place on the ground and the changes in 
the landscape pattern. Although important, LUC maps do 
not play the same key role in these models as they do in 
pattern-based models, as most of the parameters used in 
process-based models are inferred from other sources (Mas 
et al. 2014a). 

LUCC models can also be classified according to the 
scale of analysis, their stochastic or deterministic nature, the 
type of scenarios they can produce and the techniques and 
methods they apply (García-Álvarez 2018a). For example, 
some models include Markov chains to estimate the quantity 
of simulated change in the future (Sang et al. 2011; Eastman 
and Toledano 2018). These are usually calculated on the 
basis of the changes that took place between two LUC maps 
in the past (Sinha and Kumar 2013; Mas et al. 2014a), so 
increasing the importance of LUC data in the modelling 
exercise. 
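
As a minimal sketch of the Markov step described above (not the implementation of any particular modelling package), the Python snippet below cross-tabulates two categorical maps, converts the counts into transition probabilities and projects the class areas one time step ahead. The function names, the toy 3 x 3 maps and the three class codes are illustrative assumptions. The sketch also assumes that the transition rates observed between the two dates remain constant in the future.

```python
import numpy as np


def transition_matrix(map_t0, map_t1, n_classes):
    """Row-normalised transition probabilities between two categorical maps."""
    t0 = np.asarray(map_t0).ravel()
    t1 = np.asarray(map_t1).ravel()
    counts = np.zeros((n_classes, n_classes), dtype=float)
    # counts[i, j] = number of pixels that moved from class i to class j
    np.add.at(counts, (t0, t1), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Classes absent at t0 keep an identity row (assumption: they persist)
    return np.divide(counts, row_sums, out=np.eye(n_classes), where=row_sums > 0)


def project_quantities(map_t1, probs, steps=1):
    """Expected area (in pixels) of each class after `steps` time steps."""
    n_classes = probs.shape[0]
    areas = np.bincount(np.asarray(map_t1).ravel(), minlength=n_classes)
    return areas.astype(float) @ np.linalg.matrix_power(probs, steps)


# Hypothetical maps: 0 = cropland, 1 = urban, 2 = forest
map_2000 = [[0, 0, 1],
            [0, 1, 1],
            [2, 2, 2]]
map_2010 = [[0, 1, 1],
            [0, 1, 1],
            [2, 2, 1]]

probs = transition_matrix(map_2000, map_2010, n_classes=3)
projected = project_quantities(map_2010, probs, steps=1)
print(np.round(projected, 2))  # [1.33 6.33 1.33]
```

The projection only gives the quantity of each class. Where those changes are allocated on the map is decided by other components of the model.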

Modelling exercises normally consist of four main pha- 
ses: calibration, simulation, validation and the proposal of 
scenarios (Camacho Olmedo et al. 2018), although other 
phase-based structures have also been proposed. In almost 
all cases, researchers differentiate between the calibration 
and the validation phase (Pontius Jr. et al. 2004; Gallardo 
2014; Van Vliet et al. 2016). Nonetheless, some studies omit 
the validation stage, choosing solely to explore the modelled 
system and its behaviour. 

Calibration refers to the setting-up and parametrization of 
the model (Clarke 2004; Mas et al. 2018). The users define 
the objectives of the exercise, and the data and model to be 
used. They then parametrize the model in line with their 
understanding of the simulated system. After the initial 
results are obtained, the model is adjusted to obtain the best 
possible results (Van Vliet et al. 2016). Once the model is 
fully calibrated and a simulation has been obtained, this must 
be validated by comparing it with reference data that were 
not used earlier on in the modelling exercise (Pontius Jr. and 
Malanson 2005; Paegelow and Camacho Olmedo 2008). 

The methods and techniques used for calibration are 
similar to, if not the same as, those used in the validation 
phase (Mas et al. 2018). In the calibration phase, the results 
obtained from the model are compared with reference data so 
as to obtain a model that properly simulates the system being 
studied (Van Vliet et al. 2016). The model is then validated 
with independent data sources, not used in the calibration 
phase (Pontius Jr. and Malanson 2005; Van Vliet et al. 2011). 
Thus, whereas calibration fits the model to the reference data, 
validation makes sure that there is a good fit over time and not 
just for the date of the reference map. In this way, it ensures 
that the processes that explain the changes in the system 
being studied were correctly modelled. 


7 Uncertainty and Validation 


The increased availability of satellite and aerial imagery and 
the development of new methods and techniques for image 
processing and classification has enabled the production of 
an increasing number of LUC maps and time series of LUC 
maps at all scales (Yu et al. 2014; Grekousis et al. 2015; Giri 
2016a). The same trend can be observed in the application of 
LUCC models, which has become very common as a result 
of easy access to LUC maps and LUCC modelling software 
(Sohl and Sleeter 2012; Ferchichi et al. 2017). 

With the increasing production and use of LUC maps and 
LUCC models, more attention has been paid to the 
uncertainty and limitations of these data and analyses (Yeh 
and Li 2006; Krüger 2016; Loveland 2016; Ferchichi et al. 
2017; García-Álvarez et al. 2019b). Uncertainty can be 
defined as “the lack or the degree of certainty about any data 
or geospatial analysis due to the difference between reality 
and its representation through geospatial data or tools” 
(García-Álvarez et al. 2019b). Understanding how different 
these maps and exercises are from real landscapes and pro- 
cesses and, therefore, how reliable they are is essential. This 
is the only way of knowing how accurate the information we 
obtain from these maps and analyses is and to what extent it 
can be used as a basis for taking policy decisions. 

It is important to realize that all spatial data and analyses 
contain some degree of uncertainty (Longley et al. 2011). 
They are an abstraction and simplification of real landscapes 
and processes (Comber et al. 2005; Devillers and Jeansoulin 
2006). This means that the maps and models are themselves 
just conceptualizations of different processes and features of 
the Earth. When we conceptualize a landscape on a map, 
what we are actually doing is simplifying it to obtain ele- 
ments with which we can work and experiment. 

In the case of LUC maps, the complexity and variety of 
real landscapes is normally translated into a given set of 
categories (Di Gregorio and Jansen 1998; Herold and Di 
Gregorio 2012). Land Use and Land Covers do not always 
fit into a precise, clear-cut classification, as they show 
heterogeneous, mixed patterns that cannot be easily classi- 
fied within a specific category (Di Gregorio and Jansen 
1998; Villa et al. 2008). This makes it difficult to clearly 
define a particular land use and to distinguish it on the 
ground from all other land uses, establishing boundaries 
between them (Fassnacht et al. 2006). Some degree of 
uncertainty is therefore inevitable in the classification 
process. 

Mapping the full complexity of the Earth remains beyond 
human capacity, and even beyond existing computer capa- 
bilities (Unwin 1995; Murayama 2012). The smaller or 
coarser the scale, the greater the need for abstraction or 
simplification (Lloyd 2014). At whatever scale we work, we 
are capable of assimilating similar amounts of information. 
This means that at larger or finer scales we can add details, 
while at smaller or coarser scales we can only show the 
essentials. 

To understand the uncertainty and limitations of our data 
and analyses, we usually carry out uncertainty assessments 
(Van Asselt 2000; JCGM 2008; Abreu and Ralha 2017; 
García-Álvarez et al. 2019b). In general, when we assess our 
data and analyses against reference data to evaluate the 
reliability of the information they provide, we are said to be 
validating the data or models (Fonte et al. 2015; Van Vliet 
et al. 2016). Validation can therefore be defined as the 
process by which we assess how certain or reliable a piece of 
data or result is. This is done by comparing it against other 
data or information that we use as a reference and consider to 
be true. 
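
The comparison against reference data can be illustrated with a short Python sketch. It builds a cross-tabulation matrix between a mapped and a reference labelling and derives the overall accuracy, that is, the proportion of units on which both agree. The function names and the eight sample labels are hypothetical, and every unit is given equal weight, which would only be appropriate for a simple random sample or a complete pixel-by-pixel comparison.

```python
import numpy as np


def cross_tabulation(mapped, reference, n_classes):
    """Cross-tabulation matrix: rows are mapped classes, columns reference classes."""
    m = np.asarray(mapped).ravel()
    r = np.asarray(reference).ravel()
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(matrix, (m, r), 1)
    return matrix


def overall_accuracy(matrix):
    """Share of units whose mapped class matches the reference class."""
    return np.trace(matrix) / matrix.sum()


# Hypothetical labels for eight validation units (class codes 0, 1, 2)
mapped = [0, 0, 1, 1, 2, 2, 2, 1]
reference = [0, 1, 1, 1, 2, 2, 0, 1]

matrix = cross_tabulation(mapped, reference, n_classes=3)
accuracy = overall_accuracy(matrix)
print(matrix)
print(accuracy)  # 0.75
```

The off-diagonal cells show which classes are confused with each other, information that a single overall figure hides.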

Although validation is already a common practice and 
there are many methods, strategies and reference data 
available for validating LUC maps and LUCC models, there 
is still a lot of room for improvement. In the case of LUC 
maps, when Olofsson et al. (2013) carried out their review, 
up to 15% of the papers addressing land change with LUC 
maps did not include any proof of data validation. They also 
found that most of the reviewed papers did not include all 
the relevant information about the accuracy of the measured 
changes. The review carried out by Yu et al. (2014) pro- 
duced even less hopeful results: of 6771 papers including 
some type of LUC mapping exercise, only 1585 reported 
overall accuracy measures. Morales-Barquero et al. (2019) 
found that only 32% of the papers they reviewed provided a 
reproducible accuracy assessment and recommended that 
more statistically rigorous accuracy assessment practices be 
encouraged. 

In LUCCM, several authors emphasized the importance 
of analysing the uncertainty of the results, even when gen- 
eral validation exercises are carried out (Li and Wu 2006; 
Krüger 2016). In fact, Van Asselt (2000) criticized the 
widespread use of validation exercises in modelling as a tool 
“to sell the model as being scientifically credible”, without 
proper discussion and analysis of the uncertainties and 
limitations of the modelling exercise. Sohl et al. (2016) 
consider the lack of information regarding uncertainty and 
the failure to quantify it as one of the reasons hampering the 
adoption of LUCC models in decision-making. 

The uncertainty of most of the available LUC datasets has 
been assessed in a large range of research studies (Grekousis 
et al. 2015; Tsendbazar 2016). However, these studies do not 
usually address all possible sources of uncertainty. Some 
limitations have been reported regarding the validation of 
specific areas and categories, which are heterogeneous and, 
therefore, more difficult to map (Leyk et al. 2005; Fassnacht 
et al. 2006). The mapping accuracy of these categories and 
areas is not usually well characterized, as validation exer- 
cises only assess the general uncertainty or validity of the 
whole dataset (Prestele et al. 2016). Moreover, the validity 
of a specific dataset will depend on how it is used (Castilla 
and Hay 2007). An LUC map considered invalid for a 
specific type of study could be a reliable source of infor- 
mation for another study at another scale and with different 
aims. Maps like these are often described as “fit for use” or 
“fit for purpose” (Chrisman 2010). In addition, users often 
process the datasets in some way, so introducing sources of 
uncertainty that need to be evaluated (Nienkemper and Menz 
2016). When using a series of LUC maps, additional 
uncertainties may arise. As Olofsson et al. (2013) noted, 
even when two independent maps are both very accurate, it 
is possible that the accuracy of the change map obtained by 
post-classification comparison will be low due to error 
propagation. 
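
A back-of-the-envelope calculation helps to show why. The Python sketch below applies a common rule of thumb, which is our simplifying assumption here and not the procedure used by Olofsson et al. (2013): if the errors in the two maps are independent, and a unit of the change map only counts as correct when it is correctly labelled at both dates, the expected accuracy of the change map is the product of the two map accuracies. The function name and the accuracy values are illustrative.

```python
def expected_change_map_accuracy(acc_t0, acc_t1):
    """Probability that a unit is correctly labelled in both maps.

    Assumes the errors of the two maps are independent of each other.
    """
    for acc in (acc_t0, acc_t1):
        if not 0.0 <= acc <= 1.0:
            raise ValueError("accuracies must lie between 0 and 1")
    return acc_t0 * acc_t1


for acc in (0.95, 0.90, 0.85):
    combined = expected_change_map_accuracy(acc, acc)
    print(f"map accuracy {acc:.2f} -> change map accuracy about {combined:.2f}")
```

In practice, errors in successive maps are often correlated, so the product should be read as a rough indication rather than an estimate. A proper assessment requires reference data for the change itself.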

Many users develop their own maps, given the increasing 
availability of free imagery and tools with which to process 
and classify the images easily (Belward and Skøien 2015; 
Yuan et al. 2020). They need to validate the maps that they 
produce both for general purposes and for the specific use 
for which they were designed (Chuvieco 2016). 
The LUCCM community also needs to validate the results of 
its modelling exercises (Paegelow and Camacho Olmedo 
2008). To correctly interpret these results, modellers also need to 
understand the uncertainty of the LUC databases on which 
LUCC models are built (Prestele et al. 2016; García-Álvarez 
2018b), given that input data and, specifically, input LUC 
maps, are considered one of the main sources of uncertainty 
in LUCCM (Verburg et al. 2013; Houet et al. 2015). 


8 Conclusions 


Many frequent users of LUC data and LUCC models are 
unaware of the latest developments in validation and 
uncertainty analysis of LUC data. It is also possible that they 
have limited knowledge of many of the datasets currently 
available for carrying out LUC exercises. 

Many of the recent advances in this field remain within 
closed scientific communities and are not disseminated 
among the wider LUC community outside the research 
arena. This book seeks to respond to their needs. It provides 
an overview of the state of the art on LUC datasets, 
including time series of LUC maps, and the tools and 
methods available for LUC map validation. It also presents 
and explains frequently used tools and guidelines for vali- 
dating the results produced by LUCC models. As many of 
the tools and techniques reviewed here are used in both LUC 
mapping and LUCC modelling validation exercises, in this 
book we address these two analyses together. 

A full validation exercise, characterizing all the uncer- 
tainties of a given dataset or model, is a complex task that 
requires a high level of expertise and a wide range of tools 
and strategies, each one addressing different sources of 
uncertainty. This is beyond the scope of this book. Here we 
focus on the quantitative validation of LUC maps and LUCC 
model results. For detailed information about qualitative 
analyses of uncertainty, we refer readers to more specialized 
bibliography, depending on the specific objectives of their 
research. Readers wishing to find out more about other 
important aspects of uncertainty and validation practice, 
such as uncertainty communication, are also referred to 
specific literature on this topic. 


Further Reading 


Giri C (ed) (2012) Remote sensing of land use and land 
cover. Principles and applications. CRC Press. 


This is one of the main reference books on Land Use Cover 
mapping, focusing specifically on LUC mapping and anal- 
ysis. It offers an overview of the main concepts associated 
with LUC mapping and remote sensing and provides an 
introduction to this field, tracing its history. It also addresses 
the main methodological issues in relation to LUC mapping 
using remote sensing techniques, such as validation prac- 
tices, land cover change detection and image classification 
methods. In the third part, the book includes examples of 
regional LUC mapping and LUCC monitoring for different 
parts of the world. 


Manakos I, Braun M (2014) Land Use and Land Cover 
Mapping in Europe: Practices & Trends. Springer, Dor- 
drecht, Heidelberg, New York, London. 


Focused on Europe, this book is part of the reference 
bibliography for LUC mapping and LUCC monitoring. It 
reviews the state of the art of LUC mapping globally, for Europe 
and at a national level for some European countries. 
Several chapters focus on remote sensing practices and 
methods for LUC mapping and LUCC detection. The book 
also has several introductory chapters on the role of remote 
sensing in the production of LUC information. Other chap- 
ters focus on the LUCC monitoring of processes relevant for 
policymaking. 


Camacho Olmedo MT, Paegelow M, Mas J-F, Escobar F 
(2018) Geomatic Approaches for Modeling Land Change 
Scenarios. Springer, Cham, Switzerland. 


This book provides an up-to-date review of LUCCM prac- 
tice. The first part describes each of the LUCCM phases: 
calibration, simulation, validation and proposal of scenarios. 
Each chapter also presents common methods and strategies, 
implemented in different modelling software, for setting up 
and running a LUCC modelling exercise. The book also 
includes a series of technical notes for many of these tools 
and techniques, as well as short presentations of standard 
LUCC modelling software that is currently available. 
Common applications of LUCC models for thematic anal- 
yses and methodological studies are also described. 


García-Álvarez D, Van Delden H, Camacho Olmedo MT, 
Paegelow M (2019) Uncertainty Challenge in Geospatial 
Analysis: An Approximation from the Land Use Cover 
Change Modelling Perspective. In: Koutsopoulos K, de 
Miguel González R, Donert K (eds) Geospatial Challenges 
in the 21st Century. Springer, pp 289–314. 


This book chapter offers a synthetic overview of uncertainty 
in LUCCM. It includes a theoretical explanation of what 
uncertainty is and analyses its different dimensions. It also 
presents the different sources of uncertainty in LUCCM and 
reviews different strategies and methods for managing it. 


Gutman G, Janetos AC, Justice CO, et al. (2012) Land 
Change Science. Observing, Monitoring and Understanding 
Trajectories of Change on the Earth's Surface. Springer 
Netherlands, Dordrecht. 


Although somewhat dated (it was first published in 2004), this book 
provides an introduction to Land Change Science and Land 
Use Cover Change analysis. The experience acquired with 
the International Land Use and Land Cover (LUCC) 
Research Programme of NASA is the leitmotif of the 
book. It provides an overview of Land Change Science, 
defining its main concepts and presenting the main interna- 
tional initiatives in LUCC research. It also offers an over- 
view of the main processes of change analysed within the 
LUCC framework and its utility for policymaking and other 
fields. The book has various chapters focusing on method- 
ological issues, some of which refer to LUCCM. 


Belward AS, Skøien JO (2015) Who launched what, when 
and why; trends in global land-cover observation capacity 
from civilian earth observation satellites. ISPRS J 
Photogramm Remote Sens 103:115–128. 
https://doi.org/10.1016/j.isprsjprs.2014.03.009 


This paper offers an overview of the history of civilian earth 
observation satellite missions that produce information that 
can be used in LUC mapping. It describes various 
space missions and reflects on how useful they have been for 
the LUC community. 

Abstract 


This chapter offers a general overview of the available tools 
and strategies for validating Land Use Cover (LUC) data, 
specifically LUC maps, and Land Use Cover Change 
Modelling (LUCCM) exercises. We give readers some 
guidelines according to the type of maps they want to 
validate: single LUC maps (Sect. 3), time series of LUC 
maps (Sect. 4) or the results of LUCCM exercises 
(Sect. 5). Although some of the available methods are 
applicable to all these maps, each type of validation 
exercise has its own particularities, which must be taken 
into account. Each section of this chapter starts with a 
brief introduction to the specific type of maps (single, 
time series or modelling exercises) and the reference data 
needed to validate them. We also present the validation 
methods/functions and the corresponding exercises 
developed in Part III of this book. To this end, we 
address, in this order, the tools for validating Land 
Use Cover data based on basic and Multiple-Resolution 
Cross-Tabulation (see chapter “Basic and Multiple-Resolution 
Cross Tabulation to Validate Land Use Cover 
Maps”), metrics based on the Cross-Tabulation matrix (see 
chapter “Metrics Based on a Cross-Tabulation Matrix to 
Validate Land Use Cover Maps”), Pontius Jr. methods 
based on the Cross-Tabulation matrix (see chapter “Pontius 
Jr. Methods Based on a Cross-Tabulation Matrix to 
Validate Land Use Cover Maps”), validation practices 
with soft maps produced by Land Use Cover Change models 
(see chapter “Validation of Soft Maps Produced by a Land 
Use Cover Change Model”), spatial metrics (see chapter 
“Spatial Metrics to Validate Land Use Cover Maps”), 
advanced pattern analysis (see chapter “Advanced Pattern 
Analysis to Validate Land Use Cover Maps”) and 
geographically weighted methods (see chapter 
“Geographically Weighted Methods to Validate Land Use Cover 
Maps”). 


Keywords 


Land Use Cover · Land Use Cover Change Modelling 
exercises · Validation 


1 Introduction 


Validation is a required step prior to the effective use of any 
Land Use Cover (LUC) dataset or of the results of a Land 
Use Cover Change Modelling (LUCCM) exercise. We need 
to understand to what extent these datasets and results are 
uncertain in order to be able to assess the limits that these 
uncertainties may impose on the conclusions of our analyses 
and studies. 

There are many methods, tools and strategies currently 
available for validating LUC data and LUCCM exercises. 
However, comprehensive guidelines providing users with 
clear instructions and recommendations about how to carry 
out this validation are scarce. Olofsson et al. (2013, 2014) 
review the validation of land change maps and offer a series 
of recommendations as to how to perform a credible 
scientific validation, accepting that other recommendations or 
good practice guidelines could be equally valid and perhaps 
even more so. Paegelow et al. (2014, 2018) propose a variety 
of validation techniques and error analysis which can be 
used to validate different LUCCM exercises. 

D. García-Álvarez et al. (eds.), Land Use Cover Datasets and Validation Tools, 
https://doi.org/10.1007/978-3-030-90998-7_3 

In this chapter, we aim to provide readers with a general 
overview of the available tools and strategies for validating 
LUC data—specifically LUC maps—and LUCCM exer- 
cises. We give readers different guidelines according to the 
type of maps they want to validate: single LUC maps 
(Sect. 3), time series of LUC maps (Sect. 4) and results of 
LUCCM exercises (Sect. 5). Although some of the available 
methods and tools can be applied to all these maps, each 
type of validation exercise has its own specific aspects that 
users must bear in mind. For example, the results of 
LUCCM exercises include soft and hard LUC maps. The 
hard outputs of a model—hard maps—are very similar to 
input LUC maps, while the soft outputs—soft maps—are 
continuous and ranked. We therefore also present some 
validation methods that focus specifically on soft maps. 

Before presenting these validation methods and functions, 
it is important to make clear that visual inspection is an 
essential part of any validation exercise. It can provide a 
great deal of information about the uncertainties of the data 
being evaluated, which are not detected by the quantitative 
methods reviewed in this book. Visual inspection should be 
conducted during all validation exercises, at the beginning, 
at the end and throughout the entire process. 


2 Validation Methods/Functions 
and Exercises Presented in Part III of This 
Book 


This chapter is intended as a presentation of Part III of this 
book. Figure 1 shows the validation methods/functions and 
the corresponding exercises presented in the chapters and 
sections of Part III. With this in mind, in this chapter we 
address, in this order: the available tools for validating Land 
Use Cover data related to basic and Multiple-Resolution 
Cross-Tabulation (see chapter “Basic and Multiple-Resolution 
Cross Tabulation to Validate Land Use Cover Maps”), 
metrics derived from the Cross-Tabulation matrix (see 
chapter “Metrics Based on a Cross-Tabulation Matrix to 
Validate Land Use Cover Maps”), methods proposed by 
Pontius Jr. based on the Cross-Tabulation matrix (see 
chapter “Pontius Jr. Methods Based on a Cross-Tabulation 
Matrix to Validate Land Use Cover Maps”), validation 
practices with soft maps produced by Land Use Cover 
Change models (see chapter “Validation of Soft Maps 
Produced by a Land Use Cover Change Model”), spatial 
metrics (see chapter “Spatial Metrics to Validate Land Use 
Cover Maps”), advanced pattern analysis (see chapter 
“Advanced Pattern Analysis to Validate Land Use Cover 
Maps”) and geographically weighted methods (see chapter 
“Geographically Weighted Methods to Validate Land Use 
Cover Maps”). 

The exercises presented in Part III were carried out 
using the Quantum GIS (QGIS) software and R scripts. To 
homogenize the exercises across the different chapters, they 
have the same standard objectives: to validate a map (t1) 
against reference data/map (t1) (single LUC map); to 
validate a series of maps with two or more time points (t0, t1, 
t2...) (LUC maps series/LUC changes); and, for results from 
LUCCM exercises, to validate soft maps produced by the 
model against a reference map of changes (t0–t1) (soft LUC 
maps), to validate a simulation (T1) against a reference map 
(t1) (single LUC map, hard LUC maps) and to validate 
simulated changes (t0–T1) against a reference map of 
changes (t0–t1) (LUC maps series/LUC changes, hard 
LUC maps). However, in certain specific cases, additions 
have been made to these standard titles. In addition to the 
applications of each method/function implemented in the 
practical exercises in this book, the cells shaded in grey in 
Fig. 1 indicate that the method has other potential 
applications that are not described here. 


3 Validation of Single Land Use Cover Maps 


The validation of single LUC maps is the most widespread 
practice of all those addressed in this book. Foody (2002) 
concludes that there is no single universally acceptable 
measure of accuracy but rather a variety of indices, each 
sensitive to different features. Creating a single, all-purpose 
measure of classification accuracy would therefore seem an 
almost impossible goal. However, accuracy assessment must 
follow certain guidelines and principles in order to guarantee 
a scientifically defensible assessment of map accuracy 
(Stehman 1999; Stehman and Czaplewski 1998). 

Users have been validating their maps since the advent 
of digital remote sensing and the first classifications of 
digital imagery, as a means of assessing to what extent the 
classified images resemble the real LUC on the ground. 
Now, several decades later, the validation of single LUC 
maps is a very common practice, and although new meth- 
ods and tools have been developed over the years, the 
original ones remain popular. These are based above all on 
the comparison of the assessed LUC map with reference 
datasets through cross-tabulation (Foody 2002; Strahler 
et al. 2006). In recent years, the use of pattern analysis and 
other validation methods has become increasingly 
common. 
Fig. 1 Validation methods/functions and corresponding exercises presented in Part III of this book for single LUC maps, LUC maps series/LUC 
changes and LUCCM exercises. The grey cells highlight the possible applications of each method/function 

Columns (validation objectives): 
  Single LUC map: to validate a map (t1) against reference data/map (t1) 
  LUC maps series/LUC changes: to validate a series of maps with two or more time points (t0, t1, t2...) 
  LUCC modelling exercises, soft LUC maps: to validate soft maps produced by the model against a reference map of changes (t0–t1) 
  LUCC modelling exercises, hard LUC maps, single LUC map: to validate a simulation (T1) against a reference map (t1) 
  LUCC modelling exercises, hard LUC maps, LUC maps series/LUC changes: to validate simulated changes (t0–T1) against a reference map of changes (t0–t1) 

Rows (book chapters, chapter sections and exercises (1)): 
  Basic and Multiple-Resolution Cross Tabulation 
    1. Basic Cross Tabulation: Exercises 1, 2, 3, 4 
    2. Multiple-Resolution Cross Tabulation: Exercises 1, 2, 3 
  Metrics based on a Cross-Tabulation Matrix 
    1. Change statistics: Exercise 1 
    2. Areal and spatial agreement metrics: Exercises 1, 2, 3 
    3. Kappa Indices: Exercises 1, 2, 3 
    4. Agreement between maps at global and stratum level: Exercise 1 
    5. Accuracy assessment statistics: Exercise 1 
  Pontius Jr. methods based on a Cross-Tabulation matrix 
    1. Null Model: Exercise 1 
    2. LUCC Budget: Exercise 1 
    3. Quantity and allocation disagreement: Exercise 1 
    4. Figure of Merit (FoM) and complementary Producer's and User's accuracy: Exercises 1, 2, 3 
    5. Incidents and States: Exercise 1 
    6. Intensity analysis: Exercise 1 
    7. Flow matrix: Exercise 1 
  Validation of soft maps produced by a Land Use Cover Change model 
    1. Correlation: Exercise 1 
    2. Receiver Operating Characteristic (ROC): Exercise 1 
    3. Difference in Potential (DiP): Exercises 1, 2 
    4. Total uncertainty, quantity uncertainty, allocation uncertainty: Exercise 1 (2) 
  Spatial metrics 
    1. Spatial metrics: Exercises 1, 2, 3, 4, 5, 6 
  Advanced pattern analysis 
    1. Map Curves: Exercises 1, 2, 3, 4 
    2. Change on pattern borders: Exercise 1 
    3. Allocation distance error: Exercises 1, 2 
  Geographically weighted methods 
    1. Overall, user's and producer's accuracy through GWR: Exercise 1 

(1) Titles of exercises included in the corresponding sections 
(2) This validation technique is not calculated against a reference map of changes 


The reference datasets for validating single LUC maps 
may be obtained from different sources of LUC data. These 
can be classified into two main groups: ground samples and 
reference LUC maps. However, in the validation exercises, 
other reference spatial data can also be used, such as the raw 
imagery used in the classification process or the soft maps 
obtained as a result. 

The ground samples collected through field surveys provide 
highly accurate, detailed data. However, this information 
is very expensive to obtain and fieldwork is not an option 
when working with large study areas. This is why most 
reference LUC samples are obtained by photointerpretation or 
classification of satellite imagery. The data obtained via 
photointerpretation must be of higher quality than the data 
being validated. This usually involves careful interpretation 
of a set of samples using imagery with a higher spatial 
resolution than the images used to create the map. Another 
option is photointerpretation of the same imagery used to 
obtain the dataset, applying a different workflow and methods 
or techniques that guarantee better quality. 

Those using these methods to obtain LUC samples for 
validation purposes should provide information about their 
accuracy or uncertainty. When obtaining reference data by 
field surveys or photointerpretation, users must take 
particular care when selecting the sampling strategy they will 
apply during the collection of this information, as it can have 
an important impact on the results of the validation exercise 
and on their validity (see chapter “Visualization and 
Communication of LUC Data”). 

LUC maps can also be validated against other LUC maps. 
In these cases, the reference LUC map must have a higher 
spatial resolution and greater detail than the map being 
assessed. It must also be of proven quality, i.e. a map or 
dataset with verified accuracy and uncertainty. Although 
less precise, validation exercises carried out by comparing 
the evaluated map with other LUC maps are quick and very 
cheap, hence their popularity. This also allows a wider set of 
methods and techniques to be used compared to the 
possibilities offered by reference datasets other than maps. 

Users can also validate their LUC maps against additional 
sources of information other than reference datasets, in order 
to characterize the maps in more detail and gain a clearer 
picture of their uncertainty. Such sources include raw ima- 
gery, which is often used in the classification process, or the 
soft maps obtained from it, which are used to assess the 
characteristics of the pixels that make up each class. Raw 
imagery can be used to evaluate the reflectance value for all 
the pixels belonging to a particular class and how close it is 
to the reference reflectance value used in the classification 
process. When available, users can also compare each 
category pixel with soft maps showing the percentage of each 
pixel belonging to each of the LUC categories under 
consideration. Similar insights into the accuracy of LUC maps 
can be obtained by comparing them with continuous LUC 
data (reference data), such as the Vegetation Continuous 
Fields (VCF) products. 

If we focus on validation tools (Fig. 1), the agreement 
between the reference data/map (t1) and the LUC map under 
evaluation (t1), which should both refer to the same date 
t1, can be assessed using the cross-tabulation matrix¹ (see 
Sect. 1 in chapter “Basic and Multiple-Resolution Cross 
Tabulation to Validate Land Use Cover Maps”). This is also 
referred to in the literature as the confusion or error matrix, 
or as the contingency table. Cross-tabulation is usually the 
first step in any validation exercise, as the raw matrix 
provides plenty of information regarding the spatial agreement 
between the LUC map being validated and the reference 
dataset. 
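
As an illustration of the idea, the following minimal Python sketch builds a cross-tabulation matrix from two small categorical rasters held as nested lists. It is not the QGIS or R workflow used in the exercises of Part III; the function name `cross_tabulate`, the category codes and the toy maps are assumptions made only for this example.

```python
from collections import Counter

def cross_tabulate(assessed, reference):
    """Count pixels for every (assessed, reference) category pair."""
    if len(assessed) != len(reference):
        raise ValueError("maps must have the same number of rows")
    counts = Counter()
    for row_a, row_r in zip(assessed, reference):
        if len(row_a) != len(row_r):
            raise ValueError("maps must have the same number of columns")
        counts.update(zip(row_a, row_r))
    categories = sorted({c for pair in counts for c in pair})
    # Rows: categories of the assessed map; columns: categories of the reference.
    matrix = [[counts[(a, r)] for r in categories] for a in categories]
    return categories, matrix

# Hypothetical 3 x 4 maps: 1 = artificial, 2 = agricultural, 3 = forest.
assessed_map = [[1, 1, 2, 2],
                [1, 2, 2, 3],
                [3, 3, 3, 3]]
reference_map = [[1, 1, 2, 2],
                 [1, 1, 2, 3],
                 [3, 3, 2, 3]]

categories, matrix = cross_tabulate(assessed_map, reference_map)
for category, row in zip(categories, matrix):
    print(category, row)
```

The diagonal holds the pixels on which both maps agree; every off-diagonal cell identifies a specific confusion between two categories.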

In some cases, the level of agreement may vary at dif- 
ferent levels of spatial detail. For example, when spatially 
aggregated and simplified, the LUC map being evaluated 
may show more agreement with the reference dataset. The 
choice of spatial resolution is therefore a source of uncer- 
tainty. To account for this uncertainty, we can cross-tabulate 
the assessed and reference datasets at multiple spatial res- 
olutions (see Sect. 2 in chapter “Basic and 
Multiple-Resolution Cross Tabulation to Validate Land Use 
Cover Maps”), i.e. the original resolution and other coarser 
ones. 
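
The sketch below shows one possible way to compare two maps at several resolutions. It assumes a common formulation in which each coarse cell stores the proportion of every category and agreement is the sum over categories of the minimum of the two proportions; the procedure in the corresponding chapter may differ. The function names and the deliberately extreme checkerboard maps are hypothetical.

```python
def category_proportions(grid, categories, r0, c0, size):
    """Share of each category inside one coarse cell of size x size pixels."""
    cells = [grid[r][c]
             for r in range(r0, min(r0 + size, len(grid)))
             for c in range(c0, min(c0 + size, len(grid[0])))]
    return {k: cells.count(k) / len(cells) for k in categories}, len(cells)

def agreement_at_resolution(assessed, reference, size):
    """Pixel-weighted mean over coarse cells of sum_k min(p_assessed, p_reference)."""
    categories = sorted({v for row in assessed + reference for v in row})
    total_pixels = 0
    agreement = 0.0
    for r0 in range(0, len(assessed), size):
        for c0 in range(0, len(assessed[0]), size):
            p_a, n = category_proportions(assessed, categories, r0, c0, size)
            p_r, _ = category_proportions(reference, categories, r0, c0, size)
            agreement += n * sum(min(p_a[k], p_r[k]) for k in categories)
            total_pixels += n
    return agreement / total_pixels

# Two checkerboards offset by one pixel: no agreement pixel by pixel,
# but identical category proportions inside every 2 x 2 block.
assessed = [[1, 2, 1, 2], [2, 1, 2, 1], [1, 2, 1, 2], [2, 1, 2, 1]]
reference = [[2, 1, 2, 1], [1, 2, 1, 2], [2, 1, 2, 1], [1, 2, 1, 2]]

for size in (1, 2, 4):
    print(size, agreement_at_resolution(assessed, reference, size))
```

In this toy case all the disagreement is a matter of fine-scale location, so it vanishes as soon as the maps are aggregated. Real maps usually show a more gradual increase in agreement.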

Different metrics are calculated from the confusion matrix 
(see chapters “Metrics Based on a Cross-Tabulation Matrix 
to Validate Land Use Cover Maps” and “Pontius Jr. Meth- 
ods Based on a Cross-Tabulation Matrix to Validate Land 
Use Cover Maps"). These metrics summarize the agreement 
between reference and validated datasets in a single value 
and are therefore very easy to interpret. As a result, they 
have been widely used in LUC validation. 

The most common metrics are the accuracy assessment 
statistics (see Sect. 5 in chapter “Metrics Based on a 
Cross-Tabulation Matrix to Validate Land Use Cover 
Maps”) and the Kappa Indices (see Sect. 3 in chapter 
“Metrics Based on a Cross-Tabulation Matrix to Validate 
Land Use Cover Maps”). The accuracy assessment statistics 
are standard metrics that provide information about the 
similarity between two georeferenced datasets. They are 
obtained from the cross-tabulation matrix and enable the 
extraction of specific information contained in the matrix. 
They include, among others, the overall, producer's and 
user's accuracy metrics. They are usually supplied with the 
cross-tabulation matrix, providing extra information in 
addition to that provided by the matrix itself (e.g. category 
area adjusted by the error level, confidence intervals...). 
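
A minimal sketch of the three basic statistics is given below. It assumes that the rows of the matrix hold the assessed map and the columns the reference data, and it works on raw pixel counts, which is only appropriate for wall-to-wall comparisons or simple random samples; stratified samples require area weighting, which is not shown. The function name and the example matrix are hypothetical.

```python
def accuracy_metrics(matrix):
    """Overall, user's and producer's accuracy from a square cross-tabulation.

    Assumes rows hold the assessed map and columns the reference data.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = [matrix[i][i] for i in range(n)]
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    overall = sum(diagonal) / total
    # User's accuracy: share of each mapped category that is correct.
    users = [d / t if t else float("nan") for d, t in zip(diagonal, row_totals)]
    # Producer's accuracy: share of each reference category that was captured.
    producers = [d / t if t else float("nan") for d, t in zip(diagonal, col_totals)]
    return overall, users, producers

# Hypothetical matrix for three categories (pixel counts).
matrix = [[3, 0, 0],
          [1, 3, 0],
          [0, 1, 4]]
overall, users, producers = accuracy_metrics(matrix)
print(round(overall, 3), users, producers)
```

Reporting the user's and producer's values next to the overall figure makes differences between categories visible that a single number hides.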


¹ The methods/functions presented in the corresponding chapters in 
Part III of this book are highlighted in bold. 
Of all these metrics, the most commonly used in validation 
exercises is probably Overall accuracy. There has been 
great debate in the literature about the threshold above which 
the Overall accuracy of a map can be considered acceptable. 
The 85% threshold proposed by Anderson (1971) was the 
common reference for many years and continues to be 
applied by many users today (Wulder et al. 2006; 
Foody 2008). However, no specific accuracy threshold 
is regarded as valid for all study cases and datasets. 
The acceptable level of accuracy will depend on the intended 
application of the dataset and the characteristics of the area 
being mapped. As regards different scales and spatial 
resolutions, we cannot compare the accuracy of global or 
supra-national LUC maps with that of regional and local 
ones, which are not subject to the same level of 
simplification or abstraction as the global or supra-national maps. 

The overall accuracy metric does not provide information 
about the accuracy with which each category on the LUC map 
is mapped. Important differences are often identified in terms 
of the relative accuracy of the different categories. 
Mixed LUC categories do not usually show the same 
accuracy as spectrally pure categories. At high levels of 
thematic detail, very similar LUC categories can be easily 
confused and will, therefore, have lower levels of accuracy. 
Users must take these differences at the category level into 
account and report the accuracy values for each category. 
The general approach for agreement between maps at 
global and stratum level may be useful to this end (see 
Sect. 4 in chapter "Metrics Based on a Cross-Tabulation 
Matrix to Validate Land Use Cover Maps"). Some authors 
talk specifically about Overall and Individual Spatial 
Agreement, proposing different metrics for these purposes 
(Yang et al. 2017; Islam et al. 2019) (see Areal and spatial 
agreement metrics in Sect. 2 in chapter “Metrics Based on 
a Cross-Tabulation Matrix to Validate Land Use Cover 
Maps"). 

It is also important to remember that the accuracy of a 
LUC map is not usually the same across the entire mapped 
area and considerable spatial variations are possible. The 
bigger the area being mapped, the more likely it is that 
accuracy levels will vary from one part of the map to 
another. The cross-tabulation matrix does not provide 
information about these spatial differences. When mapping large 
study areas made up of different, clearly distinguishable 
regions, each region can be validated independently, 
producing a specific cross-tabulation matrix in each case. The 
global analysis would cover the entire map, while specific 
areas of the map (e.g. a region, a municipality...) could also 
be analysed at the stratum level. 

Overall Accuracy is highly correlated with the Kappa 
Index (Olofsson et al. 2014), which explains why both 
metrics provide similar information. One difference is that 
Kappa takes into account the agreement expected by chance, 
a factor that is not considered in Overall Accuracy. The 
Kappa Index (see Sect. 3 in chapter “Metrics Based on a 
Cross-Tabulation Matrix to Validate Land Use Cover 
Maps”) has been criticized by a range of authors, who claim 
that it can sometimes be misleading (Pontius and Millones 
2011; Olofsson et al. 2014). Moreover, standard indices such 
as overall, producer's and user's accuracy have the 
advantage that they can be interpreted as measures of the 
probability of encountering pixels, patches, etc. that have been 
allocated to the correct category (Stehman 1997). 
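
To make the role of chance agreement explicit, the sketch below computes the standard Kappa Index (Cohen's kappa) from a cross-tabulation matrix of pixel counts. The function name and the example matrix are assumptions for illustration only; other Kappa variants discussed in the literature are not covered.

```python
def kappa_index(matrix):
    """Standard Kappa: (observed - expected) / (1 - expected)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement is the overall accuracy.
    observed = sum(matrix[i][i] for i in range(n)) / total
    row_share = [sum(row) / total for row in matrix]
    col_share = [sum(matrix[i][j] for i in range(n)) / total for j in range(n)]
    # Agreement expected by chance, given the category proportions of both maps.
    expected = sum(r * c for r, c in zip(row_share, col_share))
    # Undefined when expected == 1 (both maps contain a single, identical category).
    return (observed - expected) / (1 - expected)

# Hypothetical matrix: rows = assessed map, columns = reference data.
matrix = [[3, 0, 0],
          [1, 3, 0],
          [0, 1, 4]]
kappa = kappa_index(matrix)
print(round(kappa, 3))
```

For this matrix the overall accuracy is 10/12 while the chance agreement is 1/3, so Kappa is lower than the overall accuracy although both are computed from the same diagonal.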

The methods mentioned above do not employ fuzzy logic 
and, instead, apply a binary logic when calculating 
agreement, i.e. the two elements either agree or do not agree. 
Partial agreements are not considered. However, there are some 
tools for calculating map agreement that incorporate fuzzy 
logic, such as the Fuzzy Kappa or the Fuzzy Kappa 
Simulation (Woodcock and Gopal 2000). 

Other metrics, similar to Kappa, have also been proposed. 
Usually they aim to outperform Kappa and correct some of 
its associated problems. These include, among others, the 
F-Score (Pérez-Hoyos et al. 2020), Scott's pi statistic (Gwet 
2002) and Krippendorff's α-coefficient (Kerr et al. 2015). 
These metrics are not widely used and they provide similar 
information to Kappa, which is why we do not recommend 
that they be used in a standard LUC validation exercise. 

Extensive research by Pontius Jr. has given rise to other 
metrics based on the cross-tabulation matrix which can be 
used to validate a single LUC map against a reference map 
(see chapter “Pontius Jr. Methods Based on a 
Cross-Tabulation Matrix to Validate Land Use Cover Maps”). Quantity 
and allocation disagreement (see Sect. 3 in chapter “Pontius 
Jr. Methods Based on a Cross-Tabulation Matrix to Validate 
Land Use Cover Maps”) (Pontius and Millones 2011) 
compares the maps in terms of the proportions allocated 
to the different categories and the way they are allocated, 
i.e. differences in the quantities 
allocated to each category and differences in their location. 
These metrics complement the cross-tabulation table, thus 
enabling users to take full advantage of the information it 
provides. Quantity and allocation disagreement is a very 
good method for validating a single map against a reference 
map (García-Álvarez and Camacho Olmedo 2017). 
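
The following sketch computes both components from a cross-tabulation matrix using the formulas given by Pontius and Millones (2011) as we understand them: for each category, quantity disagreement is the absolute difference between its proportion in the two maps, and allocation disagreement is twice the smaller of its commission and omission errors; both sums are then halved. The function name and the example matrix are hypothetical.

```python
def quantity_allocation_disagreement(matrix):
    """Return (quantity, allocation) disagreement as proportions of the map.

    Assumes rows hold the assessed map and columns the reference map.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    p = [[cell / total for cell in row] for row in matrix]
    quantity = 0.0
    allocation = 0.0
    for g in range(n):
        assessed_share = sum(p[g])                        # row total
        reference_share = sum(p[i][g] for i in range(n))  # column total
        commission = assessed_share - p[g][g]
        omission = reference_share - p[g][g]
        quantity += abs(reference_share - assessed_share)
        allocation += 2 * min(commission, omission)
    # Each mismatch is counted once per category involved, hence the halving.
    return quantity / 2, allocation / 2

matrix = [[3, 0, 0],
          [1, 3, 0],
          [0, 1, 4]]
quantity, allocation = quantity_allocation_disagreement(matrix)
print(round(quantity, 4), round(allocation, 4))
```

The two components add up to the total disagreement, i.e. one minus the overall accuracy, which offers a quick consistency check on any implementation.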

Users can also specifically assess the pattern of the map 
they want to validate to find out how closely it 
coincides with that of the reference map. Pattern agreement 
can be assessed using Spatial metrics (see Sect. 1 in chapter 
“Spatial Metrics to Validate Land Use Cover Maps”) and the 
Map Curves method (see Sect. 1 in chapter “Advanced 
Pattern Analysis to Validate Land Use Cover Maps”). 
Spatial metrics allow us to characterize different aspects of the 
map's pattern in detail, such as its fragmentation, the 
proportion allocated to each category or the complexity of the 
patches (Botequilha et al. 2006; Forman 1995). Initially 
developed within the field of landscape ecology, these 
metrics are also widely used for characterizing the pattern of 
categorical maps. For its part, Map Curves (Hargrove et al. 
2006) provides a single value summarizing the pattern 
agreement between two maps. In both cases, we should 
always compare maps drawn at the same spatial and 
thematic resolution, as any changes in resolution would 
severely alter the pattern of the map, so rendering the 
comparison uninformative. 
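
As a small example of a spatial metric, the sketch below counts the number of patches per category, a basic fragmentation indicator. It assumes 4-neighbour (rook) connectivity; dedicated landscape-metric software often offers 8-neighbour connectivity as well, which would give different counts. The function name and the toy maps are hypothetical.

```python
def count_patches(grid):
    """Number of patches per category, using 4-neighbour (rook) connectivity."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    patches = {}
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            # An unvisited pixel starts a new patch of its category.
            category = grid[r][c]
            patches[category] = patches.get(category, 0) + 1
            stack = [(r, c)]
            seen[r][c] = True
            while stack:  # flood fill over same-category neighbours
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == category):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return patches

assessed_map = [[1, 1, 2, 2],
                [1, 2, 2, 3],
                [3, 3, 3, 3]]
reference_map = [[1, 1, 2, 2],
                 [1, 1, 2, 3],
                 [3, 3, 2, 3]]
assessed_patches = count_patches(assessed_map)
reference_patches = count_patches(reference_map)
print(assessed_patches, reference_patches)
```

Here the assessed map merges the two forest patches of the reference map into one, a pattern difference that the cross-tabulation matrix alone would not reveal.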

Geographically weighted methods (see chapter 
“Geographically Weighted Methods to Validate Land Use 
Cover Maps”) can also be used to study the spatial 
distribution of LUC accuracy measures. The overall, user's and 
producer's accuracy metrics mentioned above are derived 
from the cross-tabulation matrix and are therefore not spatial 
metrics, i.e. they provide overall information for the entire 
area, without assessing the spatial distribution of error and 
accuracy. The application of Overall, user's and 
producer's accuracy metrics through GWR (Geographically 
Weighted Regression; see Sect. 1 in 
chapter “Geographically Weighted Methods to Validate 
Land Use Cover Maps”) can help the user to assess the 
suitability of the LUC data and to observe local variations in 
accuracy and error on the map (Comber 2013). In some 
cases, local assessments may be necessary because they can 
uncover possible clusters of errors in the LUC data. By 
adapting logistic GWR (Brunsdon et al. 1996), the spatial 
variations in 
Boolean LUC (classified data) and fuzzy LUC (reference 
data) can be modelled, providing maps that show the 
distribution of the overall, user's and producer's accuracy 
metrics. 
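
The sketch below is a deliberately simplified stand-in for this idea, not the GWR implementation used in the corresponding chapter. It estimates a local overall accuracy as the Gaussian-kernel weighted share of correctly classified validation samples around a location, which corresponds to what an intercept-only weighted logistic model would estimate. The sample coordinates, the bandwidth and the function name are assumptions.

```python
import math

def local_overall_accuracy(samples, location, bandwidth):
    """Gaussian-kernel weighted share of correct samples around `location`.

    samples: iterable of (x, y, correct), with correct equal to 1 or 0.
    """
    x0, y0 = location
    weighted_correct = 0.0
    weight_sum = 0.0
    for x, y, correct in samples:
        distance_sq = (x - x0) ** 2 + (y - y0) ** 2
        # Nearby samples count more than distant ones.
        weight = math.exp(-0.5 * distance_sq / bandwidth ** 2)
        weighted_correct += weight * correct
        weight_sum += weight
    return weighted_correct / weight_sum

# Hypothetical validation samples: correct in the west, mostly wrong in the east.
samples = [(0, 0, 1), (1, 0, 1), (2, 0, 1),
           (8, 0, 0), (9, 0, 0), (10, 0, 1)]
west = local_overall_accuracy(samples, (1, 0), bandwidth=2)
east = local_overall_accuracy(samples, (9, 0), bandwidth=2)
print(round(west, 2), round(east, 2))
```

The global overall accuracy of these samples is 4/6, yet the local estimates differ sharply between the two ends of the transect, which is the kind of error cluster a local assessment is meant to reveal.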


4 Validation of Land Use Cover Maps 
Series/Land Use Cover Changes 


There is no common practice or set of methods for validating 
or evaluating the uncertainty of a LUC map series with two 
or more time points (t0, t1, t2...). Most of the exercises for the 
validation of LUC data only refer to single LUC maps, 
without focusing specifically on the LUC change studied 
through a series of LUC maps. 

One of the facets that users most demand from LUC data 
is the ability to study and display LUC changes over time. 
We therefore need methods and tools to assess the uncer- 
tainty of the changes that are measured from LUC maps. It is 
worth noting that the individual accuracy of two LUC maps 
involved in a post-classification comparison offers few clues 
as to the accuracy of change, because the relation between 
the errors in the two maps is unknown. As pointed out by 
Olofsson et al. (2013), even when both maps are highly 
accurate, it is possible that the change map accuracy will be 
low and the estimated change area strongly biased. 
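
A constructed example may help to see why. In the sketch below, which uses invented data rather than any real dataset, two binary maps are each 95% accurate, but their errors fall on different pixels, so every error appears as a spurious change when the maps are compared.

```python
# Hypothetical landscape of 100 pixels: 1 = forest, 0 = non-forest.
truth_t0 = [1] * 50 + [0] * 50
truth_t1 = truth_t0.copy()
truth_t1[0] = truth_t1[1] = 0          # only two pixels really change

def flip(values, positions):
    """Return a copy of a binary map with errors at the given positions."""
    out = values.copy()
    for p in positions:
        out[p] = 1 - out[p]
    return out

def share_equal(a, b):
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Each map contains five errors, located in different pixels.
map_t0 = flip(truth_t0, range(10, 15))
map_t1 = flip(truth_t1, range(20, 25))
accuracy_t0 = share_equal(map_t0, truth_t0)
accuracy_t1 = share_equal(map_t1, truth_t1)

# Post-classification comparison: change is wherever the two maps differ.
true_change = [a != b for a, b in zip(truth_t0, truth_t1)]
mapped_change = [a != b for a, b in zip(map_t0, map_t1)]
true_area = sum(true_change)
mapped_area = sum(mapped_change)
correctly_mapped = sum(t and m for t, m in zip(true_change, mapped_change))
users_accuracy_change = correctly_mapped / mapped_area

print(accuracy_t0, accuracy_t1, true_area, mapped_area,
      round(users_accuracy_change, 3))
```

The mapped change covers 12 pixels against 2 pixels of real change, so only one sixth of the mapped change is correct even though both input maps would pass a conventional accuracy threshold.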
One of the main limitations when it comes to validating 
LUC changes and LUC map series is the lack of reference 
data. We could obtain reference datasets via 
photointerpretation or field surveys. However, it is difficult to guess where 
the LUC changes will take place, as they may happen at 
different places and with different intensities and patterns 
over space and time. In addition, there is a clear lack of LUC 
map series showing accurate, validated LUC change that 
could be used as reference data. Another option would be to 
validate the LUC changes against other types of reference 
data. This could be done for example by comparing the LUC 
changes measured over a time series of LUC maps against 
the difference in reflectance between two satellite images for 
the same time period. This is because when LUC change 
takes place, there is a significant change in the reflectance 
value registered by the satellite capturing the images. 

Nevertheless, as commented earlier, the most common 
situation is that there are no reference datasets available. In 
these cases, the uncertainty of the LUC map series must be 
assessed by evaluating the consistency and the logic of the 
measured LUC change. The tools and techniques recom- 
mended here provide a great deal of information to the user. 
However, the final interpretation of the measured LUC 
change will be subjective, based on the user's expertise and 
understanding of the study area. In this situation, visual 
inspection can be very useful for quickly understanding 
many of the uncertainties in the time series of LUC maps 
that cannot be measured using quantitative metrics. This is 
why we recommend visual inspection as a first essential step 
prior to the validation of any LUC map or LUC modelling 
exercise. 

Users must be aware that LUC change usually represents 
a very small portion of the mapped area. In a landscape of 
limited extent, we would only expect a few features to 
change over a short period of time. In addition, the same area 
would not normally be expected to undergo several 
successive changes. On the contrary, when an area changes, 
the new land use or cover tends to remain unchanged over 
time. In addition, there are some LUC transitions that make 
less sense than others. For example, one would not expect an 
artificial area to change to vegetation or agricultural land. 
These general assumptions may be adapted in line with the 
particular characteristics of the study area and also within the 
context of each element being analysed. 

The same validation techniques reviewed above for single 
LUC maps (Sect. 3) can also be applied when comparing 
measured and reference changes or just for evaluating the 
consistency and logic of measured LUC change. However, 
some tools are specific to time series (Fig. 1). 

The cross-tabulation matrix (see Sect. 1 in chapter 
“Basic and Multiple-Resolution Cross Tabulation to 
Validate Land Use Cover Maps”) is the tool that provides most 
information about the change happening between two LUC 
maps. For a time series, we can compare each pair of LUC 
maps to find out the changes that take place in each period and 
the area they cover, for the map as a whole and at category 
level. We can summarize the main processes of change in 
our study area, such as, for example, the artificialization or 
deforestation rates for each time period. This gives us an 
overview of the change that has taken place over our map 
series and makes it easier to interpret some of the 
inconsistencies in measured change. Some authors also propose 
making a summary of all the transitions taking place, 
associating some of them with a default degree of uncertainty 
(Gómez et al. 2016; Hao and Gen-Suo 2014). For example, a 
transition from artificial surfaces to agricultural areas is not 
expected and could therefore be assigned a high degree of 
uncertainty. 
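
The sketch below shows one simple way of summarizing transitions from a change matrix and flagging those considered unlikely. The category names, the areas and the list of unexpected transitions are invented for illustration; in practice this list must be defined by the user according to the study area.

```python
def flag_transitions(change_matrix, categories, unexpected):
    """List non-persistent transitions with their area and a plausibility flag.

    Assumes rows hold the categories at t0 and columns the categories at t1.
    """
    report = []
    for i, origin in enumerate(categories):
        for j, destination in enumerate(categories):
            area = change_matrix[i][j]
            if origin == destination or area == 0:
                continue  # skip persistence and empty transitions
            if (origin, destination) in unexpected:
                flag = "high uncertainty"
            else:
                flag = "plausible"
            report.append((origin, destination, area, flag))
    return report

categories = ["artificial", "agricultural", "forest"]
# Hypothetical change matrix in hectares (rows: t0, columns: t1).
change_matrix = [[120, 4, 0],
                 [15, 300, 6],
                 [2, 10, 500]]
# Transitions assumed to be unlikely in this hypothetical study area.
unexpected = {("artificial", "agricultural"), ("artificial", "forest")}

report = flag_transitions(change_matrix, categories, unexpected)
for line in report:
    print(line)
```

The flagged transitions are candidates for visual inspection rather than confirmed errors, since some unusual changes do occur in reality.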

Multiple-resolution cross-tabulation (see Sect. 2 in chapter 
“Basic and Multiple-Resolution Cross Tabulation to 
Validate Land Use Cover Maps”) offers a means of checking 
whether some of the errors, inconsistencies or uncertainties 
we detect at the original resolution disappear at coarser 
resolutions. When this happens, the errors and 
inconsistencies probably arise from the level of detail at which the 
dataset was created. 

The cross-tabulation matrix is an excellent source of 
information, which we can easily summarize using other 
tools and metrics. As commented in Sect. 3, Areal and 
spatial agreement metrics (see Sect. 2 in chapter “Metrics 
Based on a Cross-Tabulation Matrix to Validate Land Use 
Cover Maps") and Kappa Indices (see Sect. 3 in chapter 
"Metrics Based on a Cross-Tabulation Matrix to Validate 
Land Use Cover Maps") are used to assess the agreement 
between two maps. Despite their limitations, these metrics 
can be used to chart, in a generic way, the persistence or 
changes between two dates. If two maps in a series undergo 
the normal rate of change that we associate with any land- 
scape, the differences between them should be slight, which 
means that the Kappa and agreement metrics should reflect 
high levels of coincidence between the maps being 
compared. 

The Agreement between maps at global and stratum 
level (see Sect. 4 in chapter “Metrics Based on a 
Cross-Tabulation Matrix to Validate Land Use Cover 
Maps") analysis could provide additional specific informa- 
tion about the agreement in a time series of LUC maps at 
whole map level, or for a given stratum, i.e. a smaller area or 
a specific LUC category. Accuracy assessment statistics 
can also be calculated for a LUC map series, either globally 
(see Sect. 5 in chapter “Metrics Based on a Cross-Tabulation 
Matrix to Validate Land Use Cover Maps") or locally 
(Sect. 1 in chapter “Geographically Weighted Methods to 
Validate Land Use Cover Maps"). For example, when the 
LUC map series is obtained using a base map that is pro- 
gressively updated, the first stage is to validate the base map 
of the series using the same procedure described earlier for
validating single LUC maps. Once this has been done, we 
can validate the changes against a reference dataset of 
changes through cross-tabulation, obtaining from the 
resulting table the overall, producer's and user's accuracy 
metrics. Pouliot and Latifovic (2013) coined the term Update 
Accuracy (UA) to refer to the accuracy of the measured 
changes. They refer to the accuracy of the base map as the 
Base Map Accuracy (BMA). They also propose a metric 
called Time Series Accuracy (TSA) as the mean accuracy of 
all the LUC maps that make up the series, individually 
validated through a specific reference LUC dataset for each 
case. 
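
The accuracy metrics mentioned above can be sketched as follows. The code assumes a cross-tabulation stored as a dictionary keyed by (map category, reference category), and reads the Time Series Accuracy as the plain mean of the overall accuracies of the individual maps, as described in the text. The counts and the function names are hypothetical.

```python
def accuracy_metrics(table):
    """Overall, user's and producer's accuracy from a cross-tabulation.

    `table` maps (map_category, reference_category) -> number of samples.
    """
    total = sum(table.values())
    cats = sorted({c for pair in table for c in pair})
    overall = sum(table.get((c, c), 0) for c in cats) / total
    users, producers = {}, {}
    for c in cats:
        mapped = sum(n for (m, _), n in table.items() if m == c)
        referenced = sum(n for (_, r), n in table.items() if r == c)
        # User's accuracy: correct samples over all samples mapped as c
        users[c] = table.get((c, c), 0) / mapped if mapped else None
        # Producer's accuracy: correct samples over all reference samples of c
        producers[c] = table.get((c, c), 0) / referenced if referenced else None
    return overall, users, producers

def time_series_accuracy(overall_accuracies):
    """Mean of the overall accuracies of the individually validated maps."""
    return sum(overall_accuracies) / len(overall_accuracies)

# Hypothetical sample counts: (map category, reference category) -> samples
table = {("forest", "forest"): 40, ("forest", "crop"): 10,
         ("crop", "crop"): 45, ("crop", "forest"): 5}
overall, users, producers = accuracy_metrics(table)
tsa = time_series_accuracy([0.85, 0.80, 0.90])
print(overall, users, producers, tsa)
```

The same function can be applied to a table of mapped versus reference changes to obtain the accuracy of the measured changes.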

Change statistics (see Sect. 1 in chapter “Metrics Based 
on a Cross-Tabulation Matrix to Validate Land Use Cover 
Maps") (FAO 1995; Puyravaud 2003) are widely used to 
assess land use and cover changes. These indices measure, 
for example, relative change or rates of change and allow us 
to compare the change between regions of different sizes. 
These indices can be complemented by the change matrix 
obtained from cross-tabulation. They are calculated from the 
map series itself, rather than from the cross-tabulation 
matrix. 
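
By way of illustration, the sketch below computes two annual rates of change in the forms commonly attributed to Puyravaud (2003) and FAO (1995), together with the relative change. The areas and dates are hypothetical, and the function names are our own assumptions.

```python
import math

def puyravaud_rate(area_start, area_end, years):
    """Annual rate of change r = ln(A2 / A1) / (t2 - t1)."""
    return math.log(area_end / area_start) / years

def fao_rate(area_start, area_end, years):
    """Compound annual rate q = (A2 / A1) ** (1 / (t2 - t1)) - 1."""
    return (area_end / area_start) ** (1 / years) - 1

def relative_change(area_start, area_end):
    """Net change expressed as a proportion of the initial area."""
    return (area_end - area_start) / area_start

# Hypothetical forest area (ha) in two maps that are 10 years apart
r = puyravaud_rate(1000.0, 900.0, 10)
q = fao_rate(1000.0, 900.0, 10)
print(r, q, relative_change(1000.0, 900.0))
```

Because both rates are normalized by the initial area and the length of the period, they can be compared between regions of different sizes and between periods of different lengths.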

Robert Gilmore Pontius Jr. has made major contributions 
to the family of validation techniques based on the 
cross-tabulation matrix (chapter “Pontius Jr. Methods Based 
on a Cross-Tabulation Matrix to Validate Land Use Cover 
Maps"). The LUCC budget (see Sect. 2 in chapter “Pontius 
Jr. Methods Based on a Cross-Tabulation Matrix to Validate 
Land Use Cover Maps") (Pontius et al. 2004) provides more 
information about the changes that take place between pairs 
of maps. It differentiates between net and gross changes,
thereby allowing us to gain a clearer understanding of the
transitions and swaps between categories, providing useful 
additional information to identify category confusion over 
time. Category confusion arises when the same area is 
mapped as different, albeit similar, categories at different 
points in time, when no change has actually taken place. 
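
The components of the LUCC budget can be sketched as follows, assuming a cross-tabulation keyed by (category at the first date, category at the second date). The formulation used here (net change as the absolute difference between gains and losses, swap as twice the smaller of the two) follows the usual reading of Pontius et al. (2004); the table values are hypothetical.

```python
def lucc_budget(table):
    """Gain, loss, gross change, net change and swap per category.

    `table` maps (category_at_t0, category_at_t1) -> area.
    """
    cats = sorted({c for pair in table for c in pair})
    budget = {}
    for c in cats:
        loss = sum(n for (a, b), n in table.items() if a == c and b != c)
        gain = sum(n for (a, b), n in table.items() if b == c and a != c)
        budget[c] = {
            "gain": gain,
            "loss": loss,
            "total": gain + loss,         # gross change
            "net": abs(gain - loss),      # net change
            "swap": 2 * min(gain, loss),  # gains offset by losses elsewhere
        }
    return budget

# Hypothetical areas (% of the landscape)
table = {("crop", "crop"): 50, ("crop", "urban"): 10, ("crop", "forest"): 5,
         ("forest", "forest"): 25, ("forest", "crop"): 5,
         ("urban", "urban"): 5}
budget = lucc_budget(table)
print(budget["forest"])
```

In this example, forest shows no net change but a swap of 10, which is the kind of signal that may point to confusion between similar categories rather than to real change.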

Quantity and allocation disagreement (see Sect. 3 in 
chapter “Pontius Jr. Methods Based on a Cross-Tabulation 
Matrix to Validate Land Use Cover Maps") show, at overall 
and category level, the differences between pairs of maps in
terms of category proportions (quantity) and the differences
due to the different allocation of the categories (allocation).
Few changes are expected in a time series
of maps. This means that quantity and allocation disagree- 
ment should be low and should centre on the most dynamic 
categories. 

The number of incidents and states (see Sect. 5 in 
chapter “Pontius Jr. Methods Based on a Cross-Tabulation 
Matrix to Validate Land Use Cover Maps") (Pontius et al. 
2017) also provides information that can help identify errors. 
This technique allows us to identify those areas that are more 
dynamic than expected, i.e. those that change a lot over a 
short period of time, always transitioning between the same
categories. Intensity analysis (see Sect. 6 in chapter “Pon- 
tius Jr. Methods Based on a Cross-Tabulation Matrix to 
Validate Land Use Cover Maps") (Aldwaik and Pontius 
2012) compares the rates of LUC change between periods, 
categories, and transitions. Based on the assumption that a 
category or area is expected to change at similar levels of 
intensity over time, this analysis enables us to identify those 
categories that do not comply with this assumption. The 
Flow matrix (see Sect. 7 in chapter "Pontius Jr. Methods 
Based on a Cross-Tabulation Matrix to Validate Land Use 
Cover Maps") (Runfola and Pontius 2013) measures the 
instability of annual land use change over different time 
intervals, so as to identify anomalies relative to the amount 
of change over the whole time series. 
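
As a rough sketch of the interval level of Intensity analysis, the code below compares the annual change intensity of each interval with the uniform intensity for the whole series. It assumes consecutive intervals and a constant study area; the figures and the function name are hypothetical.

```python
def interval_intensities(intervals, total_area):
    """Annual change intensity (%) per interval and for the whole series.

    `intervals` is a list of (start_year, end_year, changed_area) tuples,
    assumed to be consecutive and sorted in time.
    """
    per_interval = [
        100 * changed / total_area / (end - start)
        for start, end, changed in intervals
    ]
    span = intervals[-1][1] - intervals[0][0]
    uniform = 100 * sum(ch for _, _, ch in intervals) / total_area / span
    return per_interval, uniform

# Hypothetical changed areas (ha) in a 10,000 ha study area
intervals = [(2000, 2006, 300), (2006, 2012, 120), (2012, 2018, 180)]
per_interval, uniform = interval_intensities(intervals, 10000)
fast = [s > uniform for s in per_interval]
print(per_interval, uniform, fast)
```

Intervals whose intensity is well above the uniform value change faster than the series as a whole and are therefore candidates for further checks.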

Spatial metrics (see Sect. 1 in chapter "Spatial Metrics to Validate Land Use Cover Maps") and Map
curves (see Sect. 1 in chapter “Advanced Pattern Analysis 
to Validate Land Use Cover Maps”) enable us to charac- 
terize the pattern of each LUC map in the series. We do not 
expect the pattern of the map to vary significantly over the 
time period being analysed. This means that only smooth 
changes should be observed when comparing the spatial 
metrics for each of the periods analysed. 

Spatial metrics that specifically measure the areas that 
change between pairs of maps may also be useful. In the 
case of a pair of maps or a time series, the detection of 
change on pattern borders (see Sect. 2 in chapter 
“Advanced Pattern Analysis to Validate Land Use Cover 
Maps") (Paegelow et al. 2014) enables us to identify data 
errors resulting from different data sources, different classi- 
fiers or spectral responses. For example, the noise or error 
shown by a time series of LUC maps often arises due to 
border areas between categories being interpreted differently 
each year. Users can specifically analyse the changes that 
take place in these border patches, which are often elongated
and no more than 1 or 2 pixels wide, helping them to identify potential
errors. These patches can also be characterized through the 
calculation of spatial metrics. 


5 Validation of Land Use Cover Change 
Modelling Exercises 


Validating a LUCC modelling exercise is a complex task. In 
this case, we are not validating a single LUC map or a series 
of LUC maps, but a model application made up of multiple 
inputs, which interact to deliver new results. When validat- 
ing LUCC modelling exercises, users tend to focus exclu- 
sively on the validation of the model's hard maps, i.e. maps 
with a categorical legend similar to the input LUC maps 
(Camacho Olmedo et al. 2018). These hard maps are the 
main final output of any modelling exercise, but not the only 
one. To properly validate a LUCC modelling exercise we 
should focus not only on the scenario generated by the
model, but also on the other outputs and inputs. 

Given the nature of this book, we will be dealing 
exclusively with the validation of LUC maps associated with 
LUCC modelling exercises: input LUC maps, output soft 
LUC maps and output hard LUC maps. Users must bear in 
mind that other sources of data can be used in LUCC 
modelling exercises and can be validated via complementary 
methods. 

Modellers can begin a modelling exercise by evaluating 
the uncertainty of the input LUC maps used in the model and 
their changes according to the guidelines set out in Sects. 3 
and 4 above. This is because the quality of the input LUC 
maps can have a significant effect on the performance of the 
model. When setting up LUCC models, it is essential to 
understand the changes that take place in the set of input and 
reference maps. An assessment of the uncertainty of these 
LUC changes is therefore vital for determining and charac- 
terizing the uncertainty of the LUCC modelling exercise. 

In the following subsections, we present the validation 
tools for output LUC maps, i.e. the products obtained by the 
model, differentiating between soft and hard LUC maps. 


5.1 Soft LUC Maps 

Soft LUC maps, also referred to as suitability, change 
potential or change probability maps, are produced by the 
model to express the propensity to change over space, that is, 
the potential of each pixel to become a specific category in 
the future (Camacho Olmedo et al. 2018). Modellers can 
assess the internal behaviour and coherence of the model 
they are building by comparing the model's soft maps with 
the maps of simulated changes. They can also find out to 
what extent the changes simulated by the model coincide 
with the areas of highest potential in the respective maps for 
each modelled category. In addition, they can compare the 
soft maps obtained by different models and assess their level 
of agreement. 

Soft LUC maps are usually validated against a reference 
map of changes (t0-t1), and there are various methods for
carrying out this analysis (see chapter “Validation of Soft 
Maps Produced by a Land Use Cover Change Model"). The 
Pearson and Spearman correlation (see Sect. 1 in chapter 
"Validation of Soft Maps Produced by a Land Use Cover 
Change Model") is appropriate for a quick assessment of the 
soft map, by computing it against the map of observed 
change (Bonham-Carter 1994; Camacho Olmedo et al. 
2013). The Receiver Operating Characteristic (ROC) (see 
Sect. 2 in chapter "Validation of Soft Maps Produced by a 
Land Use Cover Change Model") (Pontius and Parmentier 
2014) is used to assess soft maps by comparing them with 
the observed binary event map. A highly predictive model 
produces a soft map in which the highly ranked values
coincide with the actual event. In soft maps, the Difference 
in Potential (DiP) proposed by Eastman et al. (2005) (see 
Sect. 3 in chapter “Validation of Soft Maps Produced by a
Land Use Cover Change Model") compares the relative 
weight of values allocated to changed areas, in other words 
the difference between the mean potential in the areas of 
change and the mean potential in the areas of no change 
(Pérez-Vega et al. 2012). 
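
The following sketch illustrates two of these ideas on a hypothetical soft map. The Difference in Potential is computed as described above. The area under the ROC curve is obtained through pairwise ranking, which is our own simplification: it matches the area under a curve built with one threshold per distinct soft value, whereas the procedure in the cited work builds the curve from a chosen set of thresholds.

```python
def difference_in_potential(soft, changed):
    """Mean potential in changed pixels minus mean in unchanged pixels."""
    in_change = [s for s, c in zip(soft, changed) if c]
    no_change = [s for s, c in zip(soft, changed) if not c]
    return sum(in_change) / len(in_change) - sum(no_change) / len(no_change)

def roc_auc(soft, changed):
    """Area under the ROC curve via pairwise ranking (ties count half)."""
    pos = [s for s, c in zip(soft, changed) if c]
    neg = [s for s, c in zip(soft, changed) if not c]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical change potential per pixel and observed change (1 = changed)
soft = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1]
changed = [1, 1, 0, 1, 0, 0]
dip = difference_in_potential(soft, changed)
auc = roc_auc(soft, changed)
print(dip, auc)
```

An area of 0.5 corresponds to a soft map that ranks pixels no better than chance, while values close to 1 indicate that the highest potentials coincide with the observed change.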

In short, the previous three methods evaluate the rela- 
tionship between the observed changed area and the soft 
LUC map, assuming that a good model output allocates the 
highest change probability values to the areas that did 
actually change, and the lowest change probability values to 
the areas that did not change. Unlike the previous methods, 
the total uncertainty, quantity uncertainty and allocation 
uncertainty indices (see Sect. 4 in chapter "Validation of 
Soft Maps Produced by a Land Use Cover Change Model") 
(Krüger and Lakes 2016) are not calculated against a refer- 
ence map of changes, and instead estimate uncertainty by 
adding together misses and false alarms based on soft pre- 
diction score levels. 

In addition to these specific indices for soft LUC maps, 
validation can also be conducted after reclassifying the 
original soft maps, so transforming continuous, ranked maps 
(soft) into categorical maps (hard) (see Sects. 1 and 2 in 
chapter “Basic and Multiple-Resolution Cross Tabulation to 
Validate Land Use Cover Maps"). This preliminary step 
enables most of the validation tools presented in this chapter 
to be applied for this purpose. 


5.2 Hard LUC Maps 


The second output obtained by the model is the hard LUC 
map. Also known as prospective LUC maps, these are 
simulated LUC maps with an identical categorical legend to 
the input LUC maps (Camacho Olmedo et al. 2018). The 
hard maps must be validated in order to understand more 
about the behaviour of the model and how well it simulates 
changes. These maps provide a clearer picture of the char- 
acteristics of the simulated changes and how they resemble 
our reference data. 


5.2.1 Single LUC Maps 

The simulation (T1) can only be validated against a single
LUC map (t1) if both maps correspond to the same year. This
will also enable users to apply the panoply of tools presented 
in Sect. 3. The Accuracy assessment statistics, computed 
either globally (see Sect. 5 in chapter “Metrics Based on a 
Cross-Tabulation Matrix to Validate Land Use Cover 
Maps") or locally (see Sect. 1 in chapter “Geographically 
Weighted Methods to Validate Land Use Cover Maps") 
could also be applied to validate the simulation against other
LUC data such as ground points. 

In addition to this generic list of tools, some metrics are 
specifically used for validating the hard LUC maps obtained 
from LUCCM exercises. Allocation distance error (see 
Sect. 3 in chapter "Advanced Pattern Analysis to Validate 
Land Use Cover Maps") (Paegelow et al. 2014) measures the 
relevance of simulation errors by computing the distance 
between a false positive (commission) and the closest object 
in the reference map, considering the minimum distance or 
the centroids of the area in question. 
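
A pixel-based approximation of this metric is sketched below. It measures, in pixel units, the minimum Euclidean distance from each false positive to the nearest pixel of observed change. Working with individual pixels rather than with objects or centroids is our own simplification, and the grids are hypothetical.

```python
import math

def allocation_distance_errors(simulated_change, reference_change):
    """Distance from each false positive to the nearest observed change.

    Both inputs are 2D lists of 0/1 values with the same shape.
    """
    targets = [(r, c)
               for r, row in enumerate(reference_change)
               for c, value in enumerate(row) if value]
    if not targets:
        return []  # no observed change to measure against
    distances = []
    for r, row in enumerate(simulated_change):
        for c, value in enumerate(row):
            if value and not reference_change[r][c]:
                distances.append(
                    min(math.hypot(r - tr, c - tc) for tr, tc in targets)
                )
    return distances

# Hypothetical binary maps of change (1 = change)
reference_change = [[0, 0, 0, 0],
                    [0, 1, 0, 0],
                    [0, 0, 0, 0]]
simulated_change = [[0, 0, 0, 0],
                    [0, 1, 1, 0],
                    [0, 0, 0, 1]]
distances = allocation_distance_errors(simulated_change, reference_change)
print(distances)
```

Short distances indicate near misses, which are usually considered less serious than changes simulated far away from any observed change.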


5.2.2 LUC Maps Series/LUC Changes 

The most appropriate, most complete validation procedure 
for hard maps must include three different maps: the simulation
(T1), a reference LUC map for the same year (t1) and
the base map over which the simulation is executed (t0). In
other words, if our modelling exercise starts in the year 
2010, we will need a base map for 2010 to establish the 
initial landscape on which the simulation will be calculated. 
Then, if we run a simulation for the year 2020, we will also 
need a reference map for 2020 in order to be able to com- 
pare how well our model simulates change. By comparing 
the simulation and the reference map we can understand to 
what extent the simulation matches the reference data. The 
changes that take place on the reference map and the sim- 
ulation can be extracted by comparing them with the base 
map. The changes extracted from the two maps can then be 
compared so as to find out how well the simulated changes 
agree with the changes that took place on the reference 
maps. 

There are many tools for validating and understanding the 
errors and uncertainties of simulated changes. In fact, all the 
methods and strategies explained in Sect. 4 can be applied in 
LUCC modelling. In this case, however, the main purpose is 
to achieve the best possible fit between the results of the 
model and the reference data. 

The majority of metrics are obtained from the 
cross-tabulation matrix (see Sect. 1 in chapter “Basic and 
Multiple-Resolution Cross Tabulation to Validate Land Use 
Cover Maps"). The cross-tabulation matrix offers a detailed 
picture of the changes that were simulated (by 
cross-tabulating the simulation with the base map), the 
changes we used as a reference (by cross-tabulating the 
reference map with the base map) and the agreement and 
disagreement between the simulation and the reference map 
(by cross-tabulating the simulation with the reference map). 
The cross-tabulation matrix can also be used to summarize 
simulated and reference change in a series covering the main 
processes of change (artificialization, deforestation, etc.). This
enables us to quickly identify the changes that have taken 
place in our simulation and to spot potential change patterns 
that do not make sense. 

Cross tabulation can be carried out at multiple resolutions
(see Sect. 2 in chapter “Basic and Multiple-Resolution
Cross Tabulation to Validate Land Use Cover Maps") (the 
original and coarser ones), to find out at which resolution 
there is the greatest agreement. Sometimes, the simulation 
and the reference landscape do not agree on the details but 
show high consistency at coarser scales. This implies that the 
model is unable to simulate the precise location of the 
changes, but it does simulate the main patterns of change 
correctly. 
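
The effect of resolution can be illustrated with the following sketch, which aggregates both maps into square blocks and measures agreement as the share of each category that the two maps have in common within each block. This minimum rule is one common way of comparing category proportions in coarse cells and is an assumption here; the grids are hypothetical and are assumed to be rectangular and of equal size.

```python
def category_shares(grid, row, col, size):
    """Proportion of each category inside one coarse cell, plus its size."""
    counts = {}
    cells = 0
    for r in range(row, min(row + size, len(grid))):
        for c in range(col, min(col + size, len(grid[0]))):
            counts[grid[r][c]] = counts.get(grid[r][c], 0) + 1
            cells += 1
    return {k: v / cells for k, v in counts.items()}, cells

def agreement_at_resolution(sim, ref, size):
    """Area-weighted agreement after aggregating to size x size blocks."""
    agree, total = 0.0, 0
    for row in range(0, len(sim), size):
        for col in range(0, len(sim[0]), size):
            p_sim, n = category_shares(sim, row, col, size)
            p_ref, _ = category_shares(ref, row, col, size)
            shared = sum(min(p, p_ref.get(k, 0.0)) for k, p in p_sim.items())
            agree += shared * n
            total += n
    return agree / total

# Hypothetical simulation and reference map (category codes)
sim = [[1, 2],
       [2, 2]]
ref = [[2, 1],
       [2, 2]]
fine = agreement_at_resolution(sim, ref, 1)
coarse = agreement_at_resolution(sim, ref, 2)
print(fine, coarse)
```

Here the two maps agree on only half of the pixels at the original resolution, but fully agree once the pixels are aggregated, because the disagreement is purely a matter of location within the block.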

Different metrics have been proposed for summarizing 
the agreement between the simulation and the reference 
maps that the cross-tabulation matrix shows in raw form (see
chapter “Metrics Based on a Cross-Tabulation Matrix to 
Validate Land Use Cover Maps"). The Areal and spatial 
agreement metrics (see Sect. 2 in chapter “Metrics Based 
on a Cross-Tabulation Matrix to Validate Land Use Cover 
Maps") could be applied to summarize the agreement 
between two maps of changes, the simulated and the refer- 
ence change maps, overall or per category. Kappa (see 
Sect. 3 in chapter “Metrics Based on a Cross-Tabulation 
Matrix to Validate Land Use Cover Maps") also summarizes 
the overall agreement between two maps. However, it has 
been widely criticized because it assesses the similarity 
between the simulation and the reference map, but does not 
distinguish between the areas that change between the two 
dates and those that do not. Therefore, in maps that simulate 
permanence correctly, the Kappa metric will be high. 
Accordingly, we only recommend Kappa for assessing how 
well permanence is simulated, and it should not be used for a 
detailed assessment of the accuracy of simulated changes. 
The Kappa Simulation proposed by Van Vliet et al. (2011) 
addresses the flaws of the standard Kappa with regard to LUCC
modelling. It focuses on the agreement between the
changes in the simulation and the changes in the reference 
map with regard to the initial map used as a base for the 
simulation. 

The Agreement between maps at global and stratum 
level (see Sect. 4 in chapter "Metrics Based on a 
Cross-Tabulation Matrix to Validate Land Use Cover 
Maps") analysis can assess, for example, whether the agreement
between an observed transition (reference map) and a simulated
transition varies across the distance classes derived from a
driver (e.g. distance to roads). Other metrics, such as change statistics (see
Sect. 1 in chapter "Metrics Based on a Cross-Tabulation 
Matrix to Validate Land Use Cover Maps"), are widely used 
for characterizing the simulated changes, providing extra 
information that may be helpful for their validation. 

Pontius proposes several metrics for validating simulated 
change (see chapter "Pontius Jr. Methods Based on a Cross 
Tabulation Matrix to Validate Land Use Cover Maps"). 
Some of them can also be used to validate time series of 
LUC maps and were therefore described in Sect. 2. The
LUCC budget (see Sect. 2 in chapter “Pontius Jr. Methods 
Based on a Cross Tabulation Matrix to Validate Land Use 
Cover Maps") technique helps users to understand the 
changes that take place between the simulation and the base 
map and between the reference and the base maps. This tool 
calculates the gross and net changes, overall and per cate- 
gory, as well as the category swaps, in both the simulated 
and the reference landscapes. This enables us to assess in 
detail whether the changes we simulated are similar to the 
changes that take place on the reference maps and follow the 
same trends. 

Quantity & allocation disagreement (see Sect. 3 in 
chapter “Pontius Jr. Methods Based on a Cross Tabulation 
Matrix to Validate Land Use Cover Maps") differentiates, at 
an overall level and per category, between the (dis)agree- 
ment between two maps in terms of the proportion of the 
map occupied by each category (quantities) and the (dis) 
agreement due to the allocation of the categories in the 
same/different places on the map (allocation). It is therefore 
useful for assessing how much of the disagreement is due to 
the way the model simulates quantities and how much is due 
to its incorrect allocation of categories. By making the 
analysis at the category level, it also allows us to assess 
where (i.e. in which categories) the errors and uncertainties 
arise. 
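
A minimal sketch of this decomposition is given below, assuming a cross-tabulation keyed by (simulated category, reference category). For each category, quantity disagreement is taken as the absolute difference between commission and omission, and allocation disagreement as twice the smaller of the two, which is the usual formulation of these components. The counts are hypothetical.

```python
def quantity_allocation_disagreement(table):
    """Overall and per-category quantity and allocation disagreement.

    `table` maps (simulated_category, reference_category) -> pixel count.
    """
    total = sum(table.values())
    cats = sorted({c for pair in table for c in pair})
    per_category = {}
    for c in cats:
        commission = sum(n for (s, r), n in table.items()
                         if s == c and r != c) / total
        omission = sum(n for (s, r), n in table.items()
                       if r == c and s != c) / total
        per_category[c] = {
            "quantity": abs(commission - omission),
            "allocation": 2 * min(commission, omission),
        }
    # Each disagreeing pixel is counted in two categories, hence the halving
    quantity = sum(v["quantity"] for v in per_category.values()) / 2
    allocation = sum(v["allocation"] for v in per_category.values()) / 2
    return quantity, allocation, per_category

# Hypothetical pixel counts
table = {("urban", "urban"): 10, ("urban", "crop"): 10,
         ("crop", "urban"): 5, ("crop", "crop"): 75}
quantity, allocation, per_category = quantity_allocation_disagreement(table)
print(quantity, allocation)
```

The two components add up to the total disagreement between the maps (0.15 in this example, with 0.05 due to quantity and 0.10 due to allocation).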

If a chronological series of simulations (more than 
two time points) is available, Incidents and States (see
Sect. 5 in chapter "Pontius Jr. Methods Based on a Cross
Tabulation Matrix to Validate Land Use Cover Maps") may
also be employed. This metric helps identify pixels that 
follow illogical transition patterns, with changes at succes- 
sive time intervals between the same pair of categories (e.g. 
from agricultural to urban fabric and then back to 
agricultural). 
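
For a single pixel, these two counts can be sketched as follows. Incidents are taken as the number of times the category changes between consecutive time points, and states as the number of different categories observed. The rule used to flag a trajectory (at least two incidents between only two states) is an illustrative assumption, and the trajectories are hypothetical.

```python
def incidents_and_states(trajectory):
    """Number of category changes and of distinct categories for one pixel."""
    incidents = sum(a != b for a, b in zip(trajectory, trajectory[1:]))
    states = len(set(trajectory))
    return incidents, states

def is_suspicious(trajectory):
    """Flag pixels that flip back and forth between the same two categories."""
    incidents, states = incidents_and_states(trajectory)
    return incidents >= 2 and states == 2

# Hypothetical category of three pixels at three time points
flipping = ["agricultural", "urban", "agricultural"]
single_change = ["agricultural", "agricultural", "urban"]
sequence = ["forest", "agricultural", "urban"]
print(incidents_and_states(flipping), is_suspicious(flipping))
```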

Intensity analysis (see Sect. 6 in chapter “Pontius Jr. 
Methods Based on a Cross Tabulation Matrix to Validate 
Land Use Cover Maps") compares the different intensities of 
change per category in simulations and reference maps over 
at least three points in time. In this way we can assess 
whether our model correctly simulated the change trend 
displayed by the reference data. The flow matrix (see 
Sect. 7 in chapter "Pontius Jr. Methods Based on a Cross 
Tabulation Matrix to Validate Land Use Cover Maps") 
could also be applied to validate simulated changes in a 
generic way, assessing the stability and instability of the real
and simulated changes over time. 

The Null model (Pontius and Malanson 2005) (see 
Sect. 1 in chapter “Pontius Jr. Methods Based on a Cross 
Tabulation Matrix to Validate Land Use Cover Maps") 
compares the agreement between the base map for the 
simulation and the reference map versus the agreement 
between the simulation and the reference map. If the former 
is higher than the latter, our modelling exercise could be
judged to have performed poorly, in that the accuracy of the 
obtained simulation is lower than that for a reference map in 
which no change takes place. This assertion may be clarified 
by using other validation tools to obtain a clearer under- 
standing of the logic and pattern of the simulated change. 
The null model is also a valuable tool for evaluating how 
well the model simulates permanence. 
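
The comparison made by the null model can be sketched as follows, using simple pixel-by-pixel agreement on hypothetical, flattened maps. The function names are illustrative.

```python
def agreement(map_a, map_b):
    """Proportion of pixels with the same category in both maps."""
    matches = sum(a == b for a, b in zip(map_a, map_b))
    return matches / len(map_a)

def null_model_test(base, simulation, reference):
    """Compare the simulation with a null model of complete persistence."""
    null_agreement = agreement(base, reference)
    model_agreement = agreement(simulation, reference)
    return null_agreement, model_agreement, model_agreement > null_agreement

# Hypothetical flattened maps (category codes)
base       = [1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
reference  = [1, 1, 1, 3, 3, 2, 2, 2, 2, 2]
simulation = [1, 1, 3, 3, 2, 2, 2, 2, 2, 3]
null_agreement, model_agreement, beats_null = null_model_test(
    base, simulation, reference
)
print(null_agreement, model_agreement, beats_null)
```

In this example the simulation agrees with the reference map less (0.7) than the unchanged base map does (0.8), even though one of the simulated changes is correct.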

The Figure of Merit (Pontius et al. 2008) and complementary
Producer's and User's accuracy (see Sect. 4 in
chapter “Pontius Jr. Methods Based on a Cross Tabulation 
Matrix to Validate Land Use Cover Maps") also measure the 
agreement between simulated changes and changes in the 
reference map. The Figure of Merit technique is recom- 
mended when trying to assess the model's ability to correctly 
simulate change. The different components of the Figure of 
Merit can be used to discover whether the model estimates 
more or less change than the reference map. It is also highly 
recommended for evaluating the congruence of model out- 
puts and model robustness. This is a form of validation that 
evaluates the agreement between simulations obtained using 
different models or using the same model parametrized in 
different ways (Paegelow et al. 2014; Camacho Olmedo 
et al. 2015). 
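
The components of the Figure of Merit can be sketched as follows for three hypothetical, flattened maps. The code follows the usual definition of the metric as the ratio between correctly simulated change and the union of observed and simulated change; the names of the components and of the function are our own.

```python
def figure_of_merit(base, simulation, reference):
    """Figure of Merit and its components for simulated change."""
    misses = hits = wrong_hits = false_alarms = 0
    for b, s, r in zip(base, simulation, reference):
        observed_change = r != b
        simulated_change = s != b
        if observed_change and not simulated_change:
            misses += 1          # observed change simulated as persistence
        elif observed_change and s == r:
            hits += 1            # observed change simulated correctly
        elif observed_change:
            wrong_hits += 1      # change simulated to the wrong category
        elif simulated_change:
            false_alarms += 1    # observed persistence simulated as change
    observed = misses + hits + wrong_hits
    simulated = hits + wrong_hits + false_alarms
    union = observed + false_alarms
    return {
        "misses": misses, "hits": hits,
        "wrong_hits": wrong_hits, "false_alarms": false_alarms,
        "figure_of_merit": hits / union if union else None,
        "producers_accuracy": hits / observed if observed else None,
        "users_accuracy": hits / simulated if simulated else None,
    }

# Hypothetical flattened maps (category codes)
base       = [1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
reference  = [1, 1, 1, 3, 3, 2, 2, 2, 2, 2]
simulation = [1, 1, 3, 3, 2, 2, 2, 2, 2, 3]
result = figure_of_merit(base, simulation, reference)
print(result)
```

Comparing the number of false alarms with the number of misses shows whether the model simulates more or less change than the reference map; here it simulates more.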

None of the above tools assesses the accuracy of the 
pattern of LUC change in the simulation. This aspect is 
important because even if the quantities simulated are wrong 
and the categories are not allocated in the same positions as 
in the reference maps, the pattern of LUC change may have 
been simulated correctly. Pattern can be validated using 
Spatial metrics (see Sect. 1 in chapter "Spatial Metrics to 
Validate Land Use Cover Maps") and the Map Curves (see 
Sect. 1 in chapter "Advanced Pattern Analysis to Validate 
Land Use Cover Maps") method, which compare the pattern 
of the simulation with the pattern of the reference landscape. 

Spatial metrics characterize many different elements of 
the landscape: fragmentation, shape complexity, category 
proportions, diversity, etc. They can be calculated specifically
for the simulated and reference changes, so allowing users to 
identify the specific pattern characteristics of the features 
that changed during the simulation period. In this way we 
can understand the size and shape of the simulated changes, 
inferring from this information how logical or uncertain they 
may be. 

The MapCurves method gives a summary figure for the 
pattern agreement between two maps, and is therefore much 
easier to interpret. However, it does not provide all the 
complex detail that can be revealed by applying the different 
spatial metrics. 

We can also analyse the changes that take place on the 
borders of existing patches and the changes that result in the 
appearance of new patches. This distinction may be useful 
for identifying errors or inconsistencies. The detection of 
change on pattern borders (see Sect. 2 in chapter
"Advanced Pattern Analysis to Validate Land Use Cover 
Maps") enables us to evaluate and identify errors in the 
simulations, which may be due to different parameters being 
applied in the model allocation procedure, such as, for 
example, the use of a contiguity filter. The Allocation dis- 
tance error (see Sect. 3 in chapter “Advanced Pattern 
Analysis to Validate Land Use Cover Maps") calculates the 
distance between wrongly simulated patches and reference 
patches, so as to gain a better picture of how well the patches 
are simulated. In this sense, a model that wrongly allocates 
change close to areas that actually change on the ground 
would be considered to have performed better than a model 
that allocates them further away. 

Abstract


This chapter presents a review of Land Use Cover 
(LUC) datasets at global and supranational scales. To this 
end, we differentiate between LUC maps (Sect. 3) and 
reference LUC datasets (Sect. 4). The former map how 
different land uses or covers are distributed across the 
Earth's surface. The latter provide a sample of LUC data
for specific points on Earth and are normally used in LUC 
mapping and modelling calibration and validation exer- 
cises. We also include a brief presentation of the main 
producers of LUC datasets (Sect. 2). The LUC maps 
reviewed here are classified according to different criteria. 
First, we differentiate between general LUC maps 
(Sect. 3.2), which provide information about all land 
uses and covers on Earth, and thematic LUC maps 
(Sect. 3.3), which focus on the mapping of a specific land 
use or cover. Second, we classify general and thematic 
LUC maps according to their extent, distinguishing 
between global and supra-national LUC maps. The 
general maps are classified according to the continent 
for which they provide information, either fully or 
partially, while the thematic maps are classified according 
to the type of land use or cover they focus on. Most of the 
datasets reviewed in this chapter are characterized in 
detail in Part IV of this book, to which this chapter acts as 
an introduction. This chapter includes a series of tables 
with all the datasets, indicating those for which a detailed 
description is provided in Part IV. 

1 Introduction


Nowadays, there are many sources of Land Use Cover 
(LUC) data. The availability of LUC data has been 
increasing since the end of the last century, in line with the 
development of remote sensing techniques and easier access 
to aerial and satellite imagery. LUC data is available at all 
spatial scales, from local to global. Access to spatial infor- 
mation, including LUC datasets, has also improved in the 
last decade with the development of the open access culture. 

Most of the LUC data being produced today refers to LUC 
maps, which are either single, one-off maps or form part of a 
time series. These maps provide layers of spatial data with 
LUC information for each part of the area being mapped at one 
(single maps) or several points in time (series of maps). Other 
spatial sources of LUC information include reference datasets 
used to validate LUC maps or train remote sensing classifiers. 
Although datasets of this kind have been produced since the 
beginning of the satellite remote sensing era, they have only 
recently become widely available for general purposes. 

In this chapter, we review the main producers of LUC 
maps and the most relevant LUC datasets currently available 
—both LUC maps and data packages with reference data. 
Although this aspires to be a comprehensive review, some 
LUC products may be missing. We focus on the datasets that 
are available for download and can be used in practice. 
When relevant, we also mention others that are currently 
unavailable for download. 

Many older LUC maps are not included, because they 
were drawn at very coarse resolution using old-fashioned 
production methods and therefore cannot meet the demands 
of modern users. Because of the scope and extent of the 
book, we focus exclusively on datasets at global and
supra-national levels. A detailed description of the approach 
followed when carrying out this review appears in chapter 
"About This Book" of this book. 

The most important datasets reviewed in this chapter are 
described in detail in Part IV of this book (chapters "Global
General Land Use Cover Datasets with a Single Date" to
"Supra-national Thematic Land Use Cover Datasets"),
where users can find a detailed description of each dataset, 
including classification schemes, production methods and 
download options. 


2 The Producers of LUC Data 


We have classified LUC data producers into four main 
groups (Fig. 1): (i) Individual users and small actors; 
(ii) Research projects; (iii) Governmental and other organizations;
and (iv) citizens producing LUC information
through Volunteered Geographic Information (VGI) initiatives.
The type of LUC data produced by each group varies.

At local and detailed scales, many organizations and
users create their own LUC datasets. The fact that they have 
easy access to aerial/satellite imagery and to software for 
processing, photointerpreting and classifying these images 
has facilitated this process. This allows users to obtain very 
specific datasets that match their particular requirements. 
The datasets created for small projects and for specific 
purposes are not usually disseminated and remain the 
property of the communities or users that produce them. 
When these datasets are made available, they are often 
provided without the necessary technical information and 
general metadata. 

At regional, national, supra-national and global scales, an 
increasing number of LUC databases are being produced for 
a broad range of users. Often these databases are specially
designed for specific communities, such as the climate 
change research community. In other cases, they provide 
more general LUC information for a wide range of research 
fields and as support for policy decisions. 

There are two main producers of LUC datasets at these scales.
The first are nationally or internationally funded research
projects, which produce the datasets in collaboration with
different universities and research institutions. The limited
timeframe of these projects often affects the continuity of the
mapping work they perform, and the datasets are not usually
improved or updated once the project has come to an end.
Dissemination of the data may also be affected by the end of
funding. The Global Land Cover Facility, a reference initiative
in the field of LUC research, which recently went offline, is a
perfect example of this problem.

Fig. 1 Classification and characterization of LUC data producers
[Figure: producers of Land Use Cover (LUC) data. 1. Individual
users and small actors: datasets for specific areas and points in
time; fund constraints that limit data sources and production
methods; usually created for very specific purposes; tend to
remain in closed communities, without open dissemination.
2. Research projects (national and international): datasets
produced in specific timeframes, which usually become outdated;
tend to address the needs of specific research communities; data
is not always available for download. 3. Public government
organizations: mapping commitment that usually lasts in time;
well funded and, therefore, more ambitious projects; aimed at a
wide community of users, either from academia, policymaking or
the private sector; datasets are usually open and well
characterized. 4. Volunteered Geographic Information (VGI):
contributions from citizens; not usually enough to produce
independent LUC data, but usually feeds other mapping
experiences led by public bodies or funded through different
research projects.]

Depending on the specific objectives of the projects and 
the institutions involved, these datasets may or may not be 
available for download. The quality of metadata and auxil- 
iary information can also vary a lot from one project to the 
next. In some cases, a lot of technical and auxiliary infor- 
mation is provided, while in others users can only access the 
dataset itself and the research paper in which it is presented. 

Governmental and other organizations are the other big 
producers of LUC data. In these cases, the objective is to 
provide information about the areas for which the organi- 
zation is responsible or the areas affected by its policies 
and/or decisions. This data is a useful source of information 
for the policymaking process and is usually part of wider 
cartographic efforts by national and regional governments, 
and sometimes by international organizations, to provide 
geographic information of reference. 

As these projects are part of official mapping work con- 
ducted by nations, regions and other large organizations, 
they are usually backed by significant long-term funding. 
These databases are therefore more likely to be updated or 
improved in the future. Another advantage is that they 
usually provide highly detailed, accurate information. They 
are also quite flexible. As a result, these databases are widely 
used by the whole scientific community, public and private 
sector professionals and many other users. 

In recent years, there has been an increase in the data 
produced by members of the public through crowdsourcing 
or similar practices. This kind of information is known as 
Volunteered Geographic Information (VGI) and is part of a 
movement called ‘citizen science’, in which private citizens 
participate in scientific research, either by gathering or val- 
idating data or by assisting in any of the other phases of the 
scientific process. 

Approaches of this kind allow local knowledge and 
expertise to be incorporated into data production. Highly 
detailed, up-to-date datasets can be produced easily and 
cheaply. Nevertheless, important issues can arise in terms of 
data quality and uncertainty, due to possible inconsistencies 
in the methods and procedures followed by the contributors, 
their different levels of expertise, etc. 


¹ https://spatialreserves.wordpress.com/2019/01/07/global-land-cover-facility-goes-offline/ 

3 Land Use Cover Maps 


Reviewing all the LUC maps currently available is a 
daunting task, which perhaps explains why it has rarely been 
attempted. To our knowledge, the only researchers to carry 
out an extensive review of LUC maps at global and regional 
scales were Grekousis et al. (2015). They focused on general 
LUC products synthesizing all the land uses and land covers 
on Earth, thereby overlooking the increasing trend towards 
thematic LUC datasets that provide detailed mapping of a 
specific land use or land cover (e.g. forest, crop areas). 

The dividing line between general and thematic LUC 
products is not always clear. Some LUC maps, for example, 
provide general information on several different land covers 
(e.g. artificial, vegetation, water) while providing a detailed 
study of just one of them, thereby adopting a thematic 
approach. Although, in our review, we classify LUC maps as 
either general or thematic, readers should be aware of these 
possible inconsistencies. 

Both types of LUC maps, general and thematic, can also 
be classified according to the extent they cover, differentiating 
between global, supranational, national, regional and local 
LUC maps. However, a comprehensive review of national, 
regional and local maps would be a huge task that is beyond 
the scope of this book. We will therefore be focusing exclu- 
sively on global and supranational LUC maps. 

LUC maps for national and, especially, for regional and 
local areas, are usually only available for developed coun- 
tries, or even highly developed countries, which can afford 
to invest in the production of spatial information and in 
research programmes. The most developed nations of the 
European Union, Australia and the United States usually 
have detailed LUC datasets, not only at a national level but 
also for specific regions. In China, the government has 
invested heavily in research, so enabling the production of 
national and regional LUC products. China is, together with 
the USA, the country producing most research on LUC 
mapping today (Yu et al. 2014). 


3.1 Platforms and Repositories 

A few online platforms and repositories provide an overview 
of the LUC datasets available. The Geo-Wiki platform 
(www.geo-wiki.org) is one of the most recent. It was initially 
developed to collect reference LUC information through 
crowdsourcing and to create a hybrid LUC map. It now hosts 
both general and thematic LUC maps. The Google Earth 
Engine Platform, which was also recently launched, includes 
a repository of spatial datasets, with a specific section 
devoted to Land Cover data 
(https://developers.google.com/earth-engine/datasets/tags/landcover). 

The FAO Geonetwork repository (www.fao.org/geonetwork/) 
makes a large number of spatial datasets available 
to users. The repository includes a specific section on LUC 
data. It hosts LUC maps at all scales and is a valuable source of 
LUC information for developing countries. The Land Processes 
Distributed Active Archive Center (LP DAAC) 
(https://lpdaac.usgs.gov/) holds most of the LUC datasets produced by 
NASA and the United States Geological Survey (USGS), in 
addition to other important global datasets. 

The Copernicus Land Monitoring System website 
(https://land.copernicus.eu/) is the main source of LUC 
products created through the Copernicus programme, and is 
of particular interest for those working with European LUC 
information. All Copernicus layers are also available through 
the WEKEO Copernicus DIAS service (https://wekeo.eu/), a 
cloud-based platform that provides access to Copernicus 
datasets and to various tools for processing them, including 
all the land monitoring data. 


3.2 General Land Use Cover Maps 


3.2.1 Global LUC Maps 

The production of global LUC datasets started at the end of 
the twentieth century. By then, coarse-resolution satellite 
imagery was available for producing consistent global LUC 
datasets at a low cost. A previous attempt had been made to 
create a global LUC map through photointerpretation of 
aerial imagery (Campbell 1983). Some authors also mention 
the maps developed by Matthews (1983), Olson et al. (1983) 
and Wilson and Henderson-Sellers (1985), when reviewing 
the first global LUC datasets. However, these datasets are 
quite thematic, focusing particularly on vegetation. They 
were created by combining existing maps with data obtained 
in the field and via interpretation of aerial imagery (Giri 
2005). 

The first global general LUC map of which we have 
record dates from 1994 (Table 1) (DeFries and Townshend 
1994). It was a global LUC map obtained by classifying 
AVHRR imagery at a very coarse resolution: one 
degree (≈111 km at the Equator). This project was led by 
the Laboratory for Global Remote Sensing of the University 
of Maryland. 

The next global LUC maps were also produced by the 
team from Maryland. These were an improvement on their 
original map. Two maps were produced at spatial resolutions 
of 8 km and 1 km, respectively (DeFries et al. 1995; Hansen 
et al. 2000). For years, they were distributed through the 
Global Land Cover Facility. However, since this repository 
went offline, only the map at 1 km has been available. The 
other two maps are now outdated, both because of their very 
coarse resolution, which is of little use for most of today's 
applications, and because of the methods employed in their 
production. 

A lot of new maps have been produced since these first 
global general LUC maps appeared, especially since 2010. 
Tables 1 and 2 provide a summary of these efforts. 
When available, the tables include a reference to the section 
of this book where these datasets are described in detail. For 
the datasets providing a time series of maps, we also specify 
to what extent LUC changes can be studied over the series of 
maps without important sources of uncertainty. 

As in the case of the pioneering maps from the University 
of Maryland, all the datasets reviewed here have been 
developed by research groups from different universities 
across the world, above all from China, Europe and the 
USA. The Joint Research Centre (JRC) of the European 
Commission and the USGS in the USA have also been actively 
involved in many of these projects. 

Table 1 List of available global general LUC maps with a single date 

LUC map | Spatial resolution | Timeframe | Number of classes | Description note 
Matthews Global Vegetation/Land Use | ~111 km | 1983 | 32 | - 
UMD LC Classification | 1 km | 1992/93 | 14 | Sect. 1 in chapter “Global General Land Use Cover Datasets with a Single Date” 
GLCC 2.0 Global | 1 km | 1992/93 | 17 (IGBP) | Sect. 2 in chapter “Global General Land Use Cover Datasets with a Single Date” 
GLC2000 | 1 km | 1999/2000 | 22 | Sect. 3 in chapter “Global General Land Use Cover Datasets with a Single Date” 
GMRCA LULC | 10 km | 2000 | 10 | - 
Geo-Wiki Hybrid | 300 m | 2000/05 | 10 | Sect. 4 in chapter “Global General Land Use Cover Datasets with a Single Date” 
LADA LUC map | ~8.3 km | 2007 | 40 | Sect. 5 in chapter “Global General Land Use Cover Datasets with a Single Date” 
GLC-SHARE | 1 km | 2014 and before | 11 | Sect. 6 in chapter “Global General Land Use Cover Datasets with a Single Date” 
OSM Landuse/Landcover | 10 m | 2017 and before | 14 | Sect. 7 in chapter “Global General Land Use Cover Datasets with a Single Date” 

Table 2 List of available global general LUC datasets with a time series of maps 

LUC map | Spatial resolution | Timeframe | Number of classes | Does it support change detection? | Description note 
GLASS-GLC | 5 km | 1982-2015 | 8 | Yes | Sect. 1 in chapter “Global General Land Use Cover Datasets with a Time Series of Maps” 
LC-CCI | 300 m | 1992-2018 | 37 | Yes | Sect. 2 in chapter “Global General Land Use Cover Datasets with a Time Series of Maps” 
GLC30 | 30 m | 2000, 2010, 2020 | 10 | Yes | Sect. 3 in chapter “Global General Land Use Cover Datasets with a Time Series of Maps” 
GLC250 | 250 m | 2001, 2010 | 25 | Not recommended | Sect. 4 in chapter “Global General Land Use Cover Datasets with a Time Series of Maps” 
MCD12Q1 | 500 m | 2001-2020 | 18 | Not recommended | Sect. 5 in chapter “Global General Land Use Cover Datasets with a Time Series of Maps” 
GLCNMO | 1 km, 500 m | 2003 (1 km), 2008 (500 m), 2013 (500 m) | 20 | No | Sect. 6 in chapter “Global General Land Use Cover Datasets with a Time Series of Maps” 
GlobCover | 300 m | 2005, 2009 | 23 | No | Sect. 7 in chapter “Global General Land Use Cover Datasets with a Time Series of Maps” 
FROM-GLC | 30 m, 10 m | 2010 (30 m), 2015 (30 m), 2017 (30, 10 m) | 11 | Not recommended | Sect. 8 in chapter “Global General Land Use Cover Datasets with a Time Series of Maps” 
CGLS-LC100 | 100 m | 2015-2019 | 23 | Yes | Sect. 9 in chapter “Global General Land Use Cover Datasets with a Time Series of Maps” 

Most of these datasets are intended for use in climate 
change modelling, for which coherent global LUC maps at 
coarse resolutions are required. However, these databases 
are becoming increasingly popular and are used for many 
other purposes, many of them related to land change. This 
has been one of the drivers promoting the creation of new 
maps, with better quality and higher detail. 

Below, we characterize the global LUC datasets produced 
in the last decades according to their method of production, 
level of accuracy and spatial, temporal and thematic 
resolutions. Over this period, map production methods have 
become increasingly complex in order to create more 
accurate maps that provide better spatial, temporal and 
thematic information. 


The Production Methods 

Nowadays, global LUC maps are created using improved 
and innovative production methods, involving advanced 
classifiers, such as those based on machine learning, as well 
as a large amount of auxiliary data. In many cases, specific LUC 
categories are mapped through several specific procedures due 
to their particular patterns, reflectance behaviour, etc. 
Additional post-classification treatments have also become 
common in a bid to avoid some of the uncertainties and 
errors associated with the production of these maps. 

In recent years, due to the increasing availability of LUC 
datasets, more and more global LUC maps are being pro- 
duced by data fusion, in which new maps are created by 
combining existing datasets using a range of different 
algorithms and approaches. The aim of these projects is to 
create datasets with higher levels of accuracy and, therefore, 
less uncertainty. To this end, they usually combine the most 
accurate or highest quality LUC information from each 
dataset. 
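
As a rough illustration of the idea of keeping the most reliable information from each source, the following Python sketch fuses co-registered categorical maps pixel by pixel. Each input map votes for its class with a weight equal to an assumed accuracy score for that map, and the class with the highest total weight wins. The arrays, class codes and weights are invented for the example and do not come from any dataset reviewed here; actual fusion projects rely on a range of more elaborate algorithms.

```python
import numpy as np


def fuse_maps(maps, weights, n_classes):
    """Accuracy-weighted vote over co-registered categorical maps."""
    maps = [np.asarray(m) for m in maps]
    shape = maps[0].shape
    if any(m.shape != shape for m in maps):
        raise ValueError("all maps must share the same grid")
    # scores[k, i, j] accumulates the weight supporting class k at pixel (i, j)
    scores = np.zeros((n_classes,) + shape, dtype=float)
    for m, w in zip(maps, weights):
        for k in range(n_classes):
            scores[k] += w * (m == k)
    # argmax over the class axis keeps the best-supported class per pixel
    return scores.argmax(axis=0)


# Hypothetical legend: 0 = water, 1 = forest, 2 = cropland
map_a = np.array([[0, 1], [1, 2]])
map_b = np.array([[0, 1], [2, 2]])
map_c = np.array([[0, 2], [2, 1]])
weights = [0.8, 0.7, 0.6]  # assumed overall accuracy of each input map

fused = fuse_maps([map_a, map_b, map_c], weights, n_classes=3)
print(fused)
```

In the lower-left pixel the two less accurate maps agree on cropland and together outweigh the single most accurate map, which shows why the result of a fusion depends heavily on how the weights are chosen.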

FAO-GLCShare is perhaps the best-known example of an 
attempt to build a new global LUC map from data fusion. It 
was created in 2014 by merging high-quality detailed 
national and regional LUC databases (Latham et al. 2014). 
In many cases, the new maps were obtained from the fusion 
of existing LUC datasets at global scales. Geo-Wiki Hybrid 
(See et al. 2015) is one of the most famous examples of 
maps created using this approach. 

LUC maps obtained from data fusion do not have a single 
specific date of reference for the mapped area. When first 
produced, they are considered as up-to-date LUC databases. 
However, if they are not updated frequently, they eventually 
become obsolete and can no longer be regarded as useful 
sources for LUC change analysis. 

The maps obtained through crowdsourcing, i.e. by 
aggregating a large number of individual inputs supplied by 
a community of people, could suffer from the same problems. 
Although still relatively rare, they could play an important 
role in the future. OSM-LULC, released in 2017 (Schultz 
et al. 2017), is the only example of a global general LUC 
map made with crowdsourced data. 

These projects are usually updated on a regular basis. 
However, problems of coverage arise. In OSM-LULC, most 
of the world (except for specific test areas in Europe) is only 
partially mapped. Moreover, as they rely on volunteers to 
provide the information they require, the mapping and 
updating work is dependent on the volunteers’ availability 
and willingness to participate. These may vary greatly from 
one country to the next and also over time. This is an 
inevitable source of uncertainty. 

The recent advent of the Google Earth Engine 
(GEE) platform has encouraged the production of new glo- 
bal LUC maps, some general and others thematic. GEE 
provides a powerful cloud computing service, giving users 
the chance to process and classify very large volumes of satellite 
imagery. This is particularly important when users do not have the 
necessary computing power to do this themselves. The 
availability of cloud-computing services will lead to an 
increase, in the near future, in the number of highly detailed 
LUC products being created using complex computer pro- 
duction methods. Many of these will be produced at global 
scales. 


Accuracy 

The development and application of new methods and 
techniques to produce LUC maps has not improved the 
accuracy of these datasets. Although some global LUC maps 
are more accurate than others, there is no correlation 
between time, the introduction of new methods and tech- 
niques and the achievement of higher levels of accuracy (Yu 
et al. 2014). 

Global LUC datasets usually have accuracy levels of over 
60%. In the best cases, they are around 80%. They are 
therefore still subject to high degrees of uncertainty. This is 
to be expected given the high level of abstraction they 
require. The entire surface of the Earth is being mapped 
according to the same method and must fit into the same 
legend. This means there is little room for local or regional 
specificities, which inevitably introduces a degree of 
uncertainty. 
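
Accuracy figures such as these are typically derived from a confusion matrix that compares the mapped class with a reference class at a set of validation samples. The sketch below, with invented counts for a three-class legend, shows how overall accuracy and the per-class user's and producer's accuracies are computed. It is only an illustration of the standard calculation, not the validation protocol of any particular dataset.

```python
import numpy as np


def overall_accuracy(confusion):
    """Share of validation samples whose mapped class matches the reference."""
    confusion = np.asarray(confusion, dtype=float)
    return np.trace(confusion) / confusion.sum()


# Invented validation counts: rows = mapped class, columns = reference class
confusion = np.array([
    [50, 5, 5],
    [10, 30, 10],
    [5, 5, 40],
])

oa = overall_accuracy(confusion)
# User's accuracy: correct samples divided by all samples mapped as that class
users = np.diag(confusion) / confusion.sum(axis=1)
# Producer's accuracy: correct samples divided by all reference samples of that class
producers = np.diag(confusion) / confusion.sum(axis=0)

print(f"overall accuracy: {oa:.2f}")
print("user's accuracy:", np.round(users, 2))
print("producer's accuracy:", np.round(producers, 2))
```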


Spatial Resolution 

LUC mapping has evolved over time, with the result that 
global LUC maps are produced at an increasing number of 
spatial resolutions. Initially, the AVHRR and VEGETA- 
TION sensors, with a spatial resolution of 1 km, were the 
main source of imagery for global LUC mapping. Later, 
imagery from MODIS (500 m) and MERIS (300 m) became 
the standard source of information. In recent years, it has 
become increasingly common to use the huge stock of 
Landsat imagery to produce global LUC maps at 30 m. 
Some projects have gone even further, producing global 
LUC maps at even finer resolutions. One example is the 
2017 edition of FROM-GLC (10 m) (Chen et al. 2019), 
which was based on Sentinel-2 imagery. 

Sentinel satellites will be providing free, long-term, 
high-quality imagery over the coming years. This may boost 
the production of global LUC maps at increasingly high 
levels of detail. 
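
The jump in data volume behind these figures can be made explicit with simple arithmetic: the number of pixels needed to cover a given area grows with the square of the improvement in resolution. The short Python sketch below uses the nominal resolutions quoted above; the function name is ours and purely illustrative.

```python
def pixel_ratio(coarse_m, fine_m):
    """Number of fine pixels needed to cover one coarse pixel."""
    return (coarse_m / fine_m) ** 2


# Nominal resolutions (in metres) as quoted in the text
resolutions_m = {
    "AVHRR / VEGETATION": 1000,
    "MODIS": 500,
    "MERIS": 300,
    "Landsat": 30,
    "Sentinel-2": 10,
}

for sensor, res in resolutions_m.items():
    ratio = pixel_ratio(1000, res)
    print(f"{sensor}: about {ratio:,.0f} pixel(s) per 1 km cell")
```

Moving from 1 km to 30 m therefore multiplies the number of pixels by more than a thousand, and moving to 10 m by ten thousand, which helps explain why cloud computing has become so relevant for global mapping.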


Temporal Resolution 

The temporal resolution of LUC maps has also increased 
over time, especially in recent years. Historical time series of 
LUC maps are becoming more common (Table 2). When 
MODIS Land Cover (MCD12Q1) was launched in 2002, it 
was the first global LUC dataset to provide a series of LUC 
maps for different years (Friedl et al. 2002). It was later 
joined by GLCNMO, GlobCover, FROM-GLC and GLC30, 
which all provided new series of LUC maps for at least two 
different points in time. 

However, in most of these series, LUC change cannot be 
reliably detected by cross-tabulating the different maps that 
make up the dataset. Different methods of production for 
each year, changes in the source of imagery, differences in 
the reflectance of the images, etc., introduce a lot of noise in 
the comparison. This makes it impossible to obtain mean- 
ingful results from LUC change analyses. 
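
For readers unfamiliar with cross-tabulation, the following Python sketch shows the operation on two tiny invented maps that share the same grid and legend. Each cell of the resulting table counts the pixels that moved from one class (rows, first date) to another (columns, second date). The sketch only counts differences between the two maps; as explained above, it cannot tell real change from noise introduced by the production of each map.

```python
import numpy as np


def cross_tabulate(map_t1, map_t2, n_classes):
    """Transition matrix: rows = class at date 1, columns = class at date 2."""
    t1 = np.asarray(map_t1).ravel()
    t2 = np.asarray(map_t2).ravel()
    if t1.shape != t2.shape:
        raise ValueError("both maps must have the same number of pixels")
    # Encode each (from, to) pair as a single index and count occurrences
    pairs = t1 * n_classes + t2
    counts = np.bincount(pairs, minlength=n_classes * n_classes)
    return counts.reshape(n_classes, n_classes)


# Invented maps with a hypothetical legend: 0 = water, 1 = forest, 2 = cropland
lc_2000 = np.array([[0, 0, 1], [1, 1, 2], [2, 2, 2]])
lc_2010 = np.array([[0, 1, 1], [1, 2, 2], [2, 2, 0]])

table = cross_tabulate(lc_2000, lc_2010, n_classes=3)
changed = table.sum() - np.trace(table)  # off-diagonal pixels differ between dates

print(table)
print("pixels that differ between the two maps:", changed)
```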

The latest version of the MODIS Land Cover (Collection 
6) incorporated important changes in the product algorithm 
and workflow to account for these sources of uncertainty 
(Sulla-Menashe et al. 2019). However, change detection is 
still not supported and is therefore not recommended. 

New time series of LUC maps have been produced 
recently with the specific purpose of enabling change 
detection. These include the LC-CCI (ESA 2017) and 
GLASS-GLC maps (Liu et al. 2020). They provide a long 
record of LUC information: with yearly maps for the period 
1992-2018 in the case of the LC-CCI, and for the period 
1982-2015 in the case of GLASS-GLC. The latter dataset 
has the longest, most frequent time series currently available. 
However, it uses a very coarse spatial resolution (5 km) and 
change detection using the GLASS-GLC map series is 
limited by various sources of uncertainty (Liu et al. 2020). 


Classification Schemes 

Unlike the spatial and temporal resolutions, there are no 
important variations over time in the thematic resolution of 
most global LUC products. In fact, standard LUC classifi- 
cation systems are now widely used so as to ensure that the 
different databases are comparable. One of the most common 
is the International Geosphere-Biosphere Programme 
(IGBP) legend, which was used in one of the first global 
LUC maps ever released: the IGBP-Dis. Maps based on the 
IGBP legend usually distinguish around 17 categories. 

The Land Cover Classification System (LCCS) proposed 
by the FAO in 1998 (Di Gregorio and Jansen 1998) has 
become the standard LUC classification method today. It is a 
flexible classification system that can be adapted to LUC 
maps at different scales and for different areas of the world. 
It first distinguishes between 8 broad land cover categories, 
each of which is later disaggregated into a varying number 
of subcategories based on a series of classifiers, which define 
the attributes or characteristics of each land cover. This 
enables users to adapt the classification detail to the required 
level of analysis. The resulting categories are mutually 
exclusive, as they are defined by different sets of classifiers. 
LCCS-based legends are hierarchical and comparable, so 
facilitating the comparison and analysis of global LUC maps 
by checking for agreements and differences. 
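
The practical benefit of hierarchical legends can be sketched in a few lines of Python: two maps with different detailed legends are first reclassified to a shared set of broad categories and then compared pixel by pixel. The class codes and lookup tables below are hypothetical and are not actual LCCS codes; they only illustrate the principle.

```python
import numpy as np

# Shared broad categories (illustrative only, not the LCCS categories)
BROAD = {"vegetation": 0, "artificial": 1, "water": 2}

# Hypothetical detailed legends of two maps, each linked to the broad level
legend_a_to_broad = {
    10: BROAD["vegetation"],
    11: BROAD["vegetation"],
    20: BROAD["artificial"],
    30: BROAD["water"],
}
legend_b_to_broad = {
    1: BROAD["vegetation"],
    2: BROAD["artificial"],
    3: BROAD["artificial"],
    4: BROAD["water"],
}


def reclassify(lc, lookup):
    """Translate detailed class codes into broad categories via a lookup table."""
    lc = np.asarray(lc)
    out = np.full(lc.shape, -1, dtype=int)
    for code, broad in lookup.items():
        out[lc == code] = broad
    if (out == -1).any():
        raise ValueError("map contains codes missing from the lookup table")
    return out


map_a = np.array([[10, 11], [20, 30]])
map_b = np.array([[1, 2], [3, 4]])

broad_a = reclassify(map_a, legend_a_to_broad)
broad_b = reclassify(map_b, legend_b_to_broad)
agreement = (broad_a == broad_b).mean()  # share of pixels with the same broad class

print(f"agreement at the broad level: {agreement:.0%}")
```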


3.2.2 Supra-national LUC Maps 

A lot of international institutions and organizations need 
comprehensive and coherent worldwide data to support their 
activities. Global datasets are also required by research 
communities that study the whole Earth as a system. For 
their part, national governments and organizations require 
large amounts of data to support policymaking at a national 
level. Many other institutions, associations, professionals 
and researchers need very detailed data that is only available 
at regional and local scales. 

Within this context, supra-national datasets do not pro- 
vide much detail and work at a different scale to that at 
which most institutions and organizations implement their 
policies. They therefore do not meet the requirements of the 
research and policy-making communities working at global 
scales. This means that there is less interest and conse- 
quently less funding for datasets at these scales, hence the 
relative lack of supra-national LUC maps. 

Supra-national LUC maps have been developed by the 
European institutions to assist policymaking and environ- 
mental monitoring in Europe. In other continents, 
supra-national LUC maps are usually developed within the 
context of different projects funded by international institu- 
tions, such as the FAO and various different US and Euro- 
pean institutions. The latter include the European Space 
Agency (ESA) and the Joint Research Centre (JRC) of the 
European Commission, which have been actively involved 
in the production of supra-national LUC maps for many 
developing areas with important biodiversity values. 


Europe 
Europe is the continent with the widest range of 
supra-national LUC maps. The European Union (EU) has 
certain powers over the European environment and is 
therefore interested in monitoring any changes in land use. 
To this end, the EU has invested in the production of 
EU-wide reference data as a reliable source of information 
on which to base their policy decisions. As a consequence, 
plenty of detailed, high-quality datasets are now available 
providing LUC information for the European continent 
(Table 3). The quality and detail of these datasets reveal the 
large amount of resources that the EU has invested in land 
monitoring, especially in recent years via the Copernicus 
programme. 

Of all the European LUC datasets, CORINE Land Cover 
(CLC) is by far the best known. It is one of the oldest and 
most successful programmes on land monitoring, offering 
very high levels of accuracy and detail. All these qualities 
have made CLC a reference in LUC mapping worldwide. It 
is the only cross-country initiative working at similar scales 
that provides detailed, temporally rich LUC data, which can 
be used effectively for change detection. CLC is one of the 
best examples of decentralized, coordinated LUC mapping. 
CLC is produced at a national level, which allows European 
countries to develop their own national datasets while taking 
advantage of the work and the resources invested to create 
CLC. 

A few non-European countries have mapped the land 
uses and covers in their entire nations or in certain specific 
areas following the CLC model. Some of them have done so 
with the help of the European institutions and other Euro- 
pean research groups. These include Palestine, Morocco, 
Tunisia, San Salvador, Guatemala, Honduras, Haiti, 
Dominican Republic, Colombia, Burkina Faso and Gabon 
(Jaffrain 2011). Nevertheless, these maps are one-off, 
single-date LUC maps which do not provide the monitoring 
capacity provided by CLC in Europe. 

Through the Copernicus programme, the EU has also 
developed coherent and consistent LUC mapping products 
aimed at monitoring the LUC dynamics of specific areas 
(e.g. coastal and metropolitan areas, riparian zones, Natura 
2000 network...). These are very detailed products in both 
spatial and thematic terms, which have been designed to 
meet the needs of their potential community of users or to 
provide information in support of a range of different poli- 
cies. Their production is centralized, so avoiding the 
inconsistencies that might result from a coordinated, 
decentralized production method. Although they were only 
recently launched, the EU has assured their long-term con- 
tinuity, so providing consistent time series of data. 

Two other series of LUC maps, which are complementary 
to CLC, are also available for Europe. Annual Land Cover is 
a recently launched product that provides annual LUC maps, 
so overcoming the temporal resolution limitations of CLC, 
which is only updated once every 6 years. Annual Land 
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Table 3 List of available general LUC datasets for Europe 


LUC map Extent Spatial Timeframe 
resolution/Scale 
HILDA Europe (EU) ] km 1900-2010 
(every 
10 years) 
CLC Europe (EU) 1:100,000 1990, 2000, 
MMU: 25 ha 2006, 2012, 
2018 
PELCOM Europe (EU) ] km 1997 
Annual Europe (EU) 30 m 2000-2019 
Land 
Cover 
GlobCorine Europe 300 m 2005, 2009 
Urban Functional urban 1:10,000 2006, 2012, 
Atlas areas of Europe MMU: 2018 
(EU) 0.25-1 ha 
N2K Natura 2000 1:5000- 2006, 2012, 
reserves of 1:10,000 2018 
Europe (EU) MMU: 0.5 ha 
Riparian Riparian areas of 1:10,000 2012, 2018 
Zones Europe (EU) MMU: 0.5 ha 
Coastal Coastal regions 1:10,000 2012, 2018 
Zones of Europe (EU) MMU: 0.5 ha 
S2GLC Europe (EU) 10 m 2017 
2017 


Cover is produced as part of a project funded by the Euro- 
pean Commission, which aims to create harmonized spatial 
datasets for Europe. However, it is not recommended for 
change detection, as there is a lot of inter-annual variability 
between LUC covers. 

HILDA is another LUC dataset providing a long time 
series of LUC maps for Europe. Although it has a coarser 
resolution, it provides the longest time series of maps 
reviewed here: 1900—2010. It was produced by a research 
project team, who combined various different datasets and 
applied complex modelling techniques (Fuchs et al. 
2013). 


Africa 

A large number of supra-national LUC maps have also been 
found for Africa (Table 4). Most of the datasets cover 
specific regions of the continent, such as Eastern, Western or 
Southern Africa. Areas that are particularly relevant for 
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Number Does it support | Description note 

of change 

classes detection? 

6 Yes Sect. 1 in chapter “Global General 
Land Use Cover Datasets for 
Europe” 
44 Yes, through Sect. 2 in chapter “Global General 
layer of Land Use Cover Datasets for 
changes Europe” 
16 One-date map Sect. 3 in chapter “Global General 
Land Use Cover Datasets for 
Europe” 

33 No Sect. 4 in chapter “Global General 
Land Use Cover Datasets for 
Europe” 

17 No Sect. 5 in chapter “Global General 
Land Use Cover Datasets for 
Europe” 
29 Yes, through Sect. 6 in chapter “Global General 
layer of Land Use Cover Datasets for 
changes Europe” 
11 Yes Sect. 7 in chapter *Global General 
Land Use Cover Datasets for 
Europe" 

56 One-date map Sect. 8 in chapter *Global General 
Land Use Cover Datasets for 
Europe" 

71 Yes, through Sect. 9 in chapter *Global General 
layer of Land Use Cover Datasets for 
changes Europe" 

13 One-date map Sect. 10 in chapter “Global 


General Land Use Cover Datasets 
for Europe” 


environmental research, such as the Congo Basin, have also 
been mapped. 

Only a few projects tried to offer an overview of the LUC 
covers for the entire African continent. The FAO mapped the 
covers for many African countries as part of the AFRI- 
COVER project, but did not encompass the whole continent. 
The first comprehensive, Africa-specific, general LUC 
dataset only appeared quite recently. It was produced by EU 
research and earth-observation organizations. No similar 
initiatives have been found for America, Asia and Oceania. 
They are also quite rare for Europe as a whole, where con- 
tinental LUC data usually covers the EU and associated 
countries. 

There are three datasets providing a time series of LUC 
maps for different African countries. However, only one of 
these (West Africa Land Use Land Cover) was obtained by 
applying a common mapping approach which provides LUC 
information for all mapped areas at the same dates. In the 
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Table 4 List of available general LUC datasets for Africa 
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LUC map Extent Spatial Timeframe Number | Does it Description note 
resolution/Scale of support 
classes change 
detection? 
West Africa Land | West Africa 2 km 1975, 2000, 2013 26 Yes Sect. 1 in chapter “General 
Use Land Cover Land Use Cover Datasets for 
Africa" 
SERVIR-ESA Eastern and 30m Different dates 7 Yes Sect. 2 in chapter “General 
Southern Africa depending on the Land Use Cover Datasets for 
country Africa” 
(1990-2015) 
SADC Land Southern African 1:250,000 Different dates 13 One-date map | Sect. 3 in chapter “General 
Cover Database Development depending on the Land Use Cover Datasets for 
Community country (1990/99) Africa” 
AFRICOVER Burundi, DR 1:200,000 Different dates 8 One-date map | Sect. 4 in chapter “General 
Congo, Egypt, depending on the Land Use Cover Datasets for 
Eritrea, Kenya, country (1994/01) Africa" 
Rwuanda, Sudan, 
Tanzania, Uganda, 
Lybia, Malawi 
CCI LAND Africa 20m 2016 10 One-date map | Sect. 5 in chapter “General 
COVER—S2 Land Use Cover Datasets for 
PROTOTYPE Africa" 
Congo Basin Congo Basin 300 m 2000/07 20 One-date map | Sect. 6 in chapter “General 


Vegetation Types | region 


other two, the time series is made up of national or regional 
LUC maps produced for different years of reference, so 
hampering cross-country LUC change analyses. 


The Americas 

In the Americas, there is a clear distinction between the 
datasets covering North America and those covering South 
America and the Caribbean (Table 5). For North America, 
the North American Land Change Monitoring System 
(NALCMS) is of particular interest. It provides LUC maps 
for Canada, Mexico and the USA at three points in time. It is 
the only LUC supra-national American dataset with a time 
series of LUC maps. The NALCMS maps are created by 
merging datasets produced individually for each participat- 
ing country following a similar approach. 

Three different maps have been produced for South 
America, including in some cases the Caribbean. These were 
the result of various different research projects and activities 
and two of them (SERENA and South America 30 m) are no 
longer accessible for use. 

South America 30 m, developed by Giri and Long 
(2014), provides the most up-to-date, detailed data. 
The SERENA map was designed to ensure its consistency 
with the NALCMS map (Blanco et al. 2013) so that together 
they could offer an overview of both North and South 
America. However, they had different spatial resolutions and 
were produced for different years of reference. 




Asia and Antarctica 

We only found one supra-national dataset for Asia, which 
covered the LUC of the Himalayan region (Table 6). It is 
possible that other supra-national datasets are available,
although language barriers would have prevented us from
reviewing them properly. In any case, China is the most advanced
country in Asia in terms of LUC mapping, and its research is 
focused above all on global and national mapping projects. 

No supra-national maps are available for Oceania, due to 
its particular characteristics in which continental areas and 
islands are usually separate individual nations. These coun- 
tries have no shared continental or inland regions for which a 
supra-national LUC dataset might be useful. As a result, no 
datasets of this kind have been produced. 

Finally, a specific LUC map for Antarctica was produced 
recently by Chinese researchers (Hui et al. 2017). It is a 
vector LUC dataset for the reference year 2000, which dif- 
ferentiates between three land cover types. It is available 
online for any interested user.[2]


[2] https://zenodo.org/record/826032.
Table 5 List of available general LUC datasets for America

| LUC map | Extent | Spatial resolution | Timeframe | Number of categories | Does it support change detection? | Description note |
|---|---|---|---|---|---|---|
| LBA-ECO LC-08 | South America | 1 km | 1987/91 | 41 | One-date map | Sect. 1 |
| NALCMS | North America | 30 m, 250 m | 2005 (250 m), 2010 (250 m, 30 m), 2015 (30 m) | 19 | Partially | Sect. 2 |
| SERENA | South America | 500 m | 2008 | 22 | One-date map | - |
| MERISAM2009 | South America | 300 m | 2008/10 | 11 | One-date map | Sect. 3 |
| South America 30 m | South America | 30 m | 2010 | 5 | One-date map | - |

Description notes refer to sections of chapter “General Land Use Cover Datasets for America and Asia”.


Table 6 List of available general LUC datasets for Asia and Antarctica

| LUC map | Scale | Timeframe | Number of categories | Does it support change detection? | Description note |
|---|---|---|---|---|---|
| The Himalaya Regional Land Cover database | 1:350,000 | 2000 | 35 | Yes, through layer of changes (1970/80-2007) | Sect. 4 in chapter “General Land Use Cover Datasets for America and Asia” |
| AntarcticaLC2000 | 1:100,000 | 2000 | 3 | One-date map | - |


3.3 Thematic Land Use Cover Datasets


Thematic Land Use Cover (LUC) datasets map parts of the
Earth’s surface as a specific land cover, considering not just
its extent but also its intensity of distribution. They normally
focus on land covers and provide very little information
about land use. Thematic LUC maps are usually produced
using automatic remote sensing techniques that find accurate
land use characterization difficult.

Thematic LUC maps usually represent land covers in
greater detail than general LUC maps. Some provide
information about the proportion of the study area occupied by a
particular land cover on the ground. In other cases, they
delineate the extent of a specific cover with great detail and
accuracy. Other thematic LUC maps share certain features
with general LUC maps, in that they map the Earth
according to a set of predefined categories, which are usually
subclasses of a specific type of cover (e.g. vegetation). Many
maps charting vegetation in its various different forms can
therefore be regarded as thematic sources of LUC
information in that they characterize a specific cover.

Some maps may provide thematic information about
specific land covers together with other relevant data. This
was especially true in the twentieth century, when many
different maps combining biogeographic and climate
information were produced for the climate and other 
research communities. These maps were usually produced 
by merging different techniques and datasets. Examples 
include the maps produced by Matthews (1983) and Olson 
et al. (1983). As these maps are now outdated and were not 
focused exclusively on land cover, we decided not to include 
them in this review. 

Prior to the advent of satellite remote sensing, there were 
also a large number of traditional maps obtained through 
photointerpretation of aerial imagery and field surveys that 
provided information on certain specific land covers. These 
maps charted vegetation above all and, to a lesser extent, 
agricultural areas. These can be useful sources of informa- 
tion for historical LUC change analysis. However, as they 
are usually only available for national or more detailed areas 
and in many cases have not been digitalized, they are not 
reviewed here either. 

There are also plenty of other spatial datasets that provide
useful information for studying specific land covers. One
example for vegetation covers is maps of live biomass
(Kindermann et al. 2008; Thurner et al. 2014). Accordingly,
there is a huge supply of information that can be used to study
and characterize land covers, which comes in datasets of many
different kinds. In this review, however, we will only be
analysing datasets with a pure land cover approach.




The fact that thematic LUC maps focus on a single, 
specific cover normally means they are more accurate than 
general LUC maps. They are often more detailed too. This 
makes them especially useful for uncertainty analysis and 
validation exercises. As a general rule, they are a good 
source of reference data for studying land covers in a par- 
ticular study area. However, they may not be as easy to use 
or to process as general LUC maps. If they provide too much 
information, users will have to process it to meet the specific 
needs of their studies. 

The progress made in recent decades in the production of 
general LUC maps has also been achieved in thematic LUC 
mapping, with increasing levels of detail and more innova- 
tive, more complex methods. Some of the newest products 
have been produced using the cloud-computing capabilities 
of Google Earth Engine, which seems likely to play a key 
role in thematic LUC mapping in the future, and will allow 
more thematic datasets to be produced. Until now, the 
Landsat archive has been the most detailed source of ima- 
gery for LUC thematic mapping, although the imagery 
provided by the Sentinel constellation of satellites will soon 
enable users to expand the catalogue of thematic LUC 
datasets at highly detailed spatial resolutions of less than 
30 m. 


3.3.1 Global Thematic LUC Maps Focusing
on Vegetation Covers

One of the most common features mapped by thematic LUC 
products is natural vegetation and tree and forest covers in 
particular. In fact, forest monitoring is one of the main 
applications of Landsat data, as reviewed by Hansen and 
Loveland (2012). This is because of widespread scientific 
interest in the study of vegetation dynamics and the fact that 
remote sensing techniques have made it much easier to 
characterize vegetation covers. 

LUC maps focusing on vegetation covers usually offer 
coherent time series of LUC data that support change 
detection (Table 7). The most popular include the Vegetation 
Continuous Fields (VCF) datasets produced by NASA. 
These were first produced at the beginning of the 2000s and 
were obtained from AVHRR data at 1 km (Hansen et al. 
2017). Since then, more VCF datasets have been produced at 
increasing levels of spatial detail, based above all on ima- 
gery from MODIS and Landsat (Hansen et al. 2003; Sexton 
et al. 2013). The temporal resolution of these products has 
also improved, with FCover providing information every 
10 days for the period 1999-2020. 

VCF datasets provide information about the vegetation
cover fraction for each pixel in the analysed area. FCover is
the only dataset that provides information on the percentage
of vegetation cover, considering all kinds of natural
vegetation. All the others, namely MEaSUREs VCF (VCF5KYR),
MODIS VCF (MOD44B), Landsat VCF (GFCC) and the Hansen
Forest Map, focus exclusively on tree covers. In addition, GFCC
and the Hansen Forest Map include specific layers of forest change.
Forests are mapped as such when a minimum fraction of their
area is covered by trees. Therefore, changes in tree cover
do not necessarily mean forest changes.
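
The difference between a change in tree cover and a change in forest can be illustrated with a minimal sketch. It assumes a hypothetical threshold of 30% tree cover for labelling a pixel as forest (thresholds differ between products and forest definitions) and uses made-up percentages for four pixels at two dates; none of these values come from the datasets reviewed here.

```python
FOREST_THRESHOLD = 30  # hypothetical minimum tree cover (%) for a pixel to count as forest

# Made-up tree cover percentages for the same four pixels at two dates
tree_cover_t1 = [80, 45, 35, 10]
tree_cover_t2 = [60, 45, 25, 10]

# Binary forest masks derived from the continuous tree cover values
forest_t1 = [pct >= FOREST_THRESHOLD for pct in tree_cover_t1]
forest_t2 = [pct >= FOREST_THRESHOLD for pct in tree_cover_t2]

# A pixel changes in tree cover if its percentage differs between dates,
# but it only changes in forest terms if it crosses the threshold
tree_cover_changed = [a != b for a, b in zip(tree_cover_t1, tree_cover_t2)]
forest_changed = [a != b for a, b in zip(forest_t1, forest_t2)]

print(tree_cover_changed)  # [True, False, True, False]
print(forest_changed)      # [False, False, True, False]
```

The first pixel loses tree cover (80% to 60%) but remains forest under the assumed threshold, whereas the third pixel (35% to 25%) is the only one that would be reported as a forest change.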

Two recent projects have explored the potential of radar 
data for mapping forest extent (Shimada et al. 2014; Martone 
et al. 2018). One of the advantages of radar data compared to 
optical sensors is that it is unaffected by weather and daylight 
conditions. This is particularly useful when mapping certain 
specific forest areas, such as those located in the tropics. 


3.3.2 Global Thematic LUC Maps Focusing 
on Agricultural Covers 

Agricultural areas are also widely mapped with specific LUC 
products (Table 8). Thematic agricultural LUC datasets 
usually show the extent of croplands and pasturelands or the 
cover fraction per unit of analysis, i.e. per pixel. In some 
cases, very detailed information on different types of crops is 
provided. These detailed LUC datasets are obtained from a 
wealth of detailed auxiliary information, as it is very difficult 
to accurately differentiate crop covers using standard remote 
sensing techniques. 
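
When a dataset stores the cropland fraction per pixel on a latitude/longitude grid, the fraction has to be multiplied by the pixel area to obtain a cropland area, and that pixel area shrinks towards the poles. The following is a minimal sketch of this calculation, assuming a spherical Earth, a 5 arc-minute grid and made-up fraction values; the function names are illustrative and do not belong to any of the reviewed products.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def cell_area_km2(lat_south_deg, cell_size_deg):
    """Area of a lat/lon grid cell whose southern edge is at lat_south_deg."""
    lat_s = math.radians(lat_south_deg)
    lat_n = math.radians(lat_south_deg + cell_size_deg)
    dlon = math.radians(cell_size_deg)
    return EARTH_RADIUS_KM ** 2 * dlon * (math.sin(lat_n) - math.sin(lat_s))


def cropland_area_km2(fraction, lat_south_deg, cell_size_deg=5 / 60):
    """Cropland area in a cell, given the cropland fraction (0 to 1) of that cell."""
    return fraction * cell_area_km2(lat_south_deg, cell_size_deg)


# The same made-up fraction (50% cropland) at the equator and at 60 degrees north
area_equator = cropland_area_km2(0.5, 0.0)
area_60n = cropland_area_km2(0.5, 60.0)
print(round(area_equator, 1), round(area_60n, 1))
```

Under these assumptions, the same cropland fraction corresponds to roughly half the area at 60 degrees of latitude compared with the equator, which is why fractions from such grids should not be summed without area weighting.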

Unlike other LUC thematic products, those mapping 
agricultural areas do not usually offer a time series, which 
means they cannot be used for land change analysis. Map- 
ping agricultural areas is quite complex and this has hindered 
the production of coherent time series of agricultural LUC 
maps. One exception to this general trend was the dataset by 
Ramankutty and Foley (1999), who used historical sources 
of LUC data to model cropland cover on Earth from 1992 
back to 1700. Another exception was the Harvested Area 
and Yield for 4 Crops maps, which provided information for 
three different dates. 


3.3.3 Global Thematic LUC Maps Focusing 
on Artificial Covers 

Built-up areas are becoming a common subject for thematic 
LUC products. As with the datasets focusing on vegetation 
covers, they provide time series of data which support 
change detection (Table 9). However, many of these maps
are binary maps that only differentiate between
urban/impervious and non-urban/non-impervious surfaces.
They do not provide information about specific land uses,
which limits their utility: people working with artificial
surfaces are more interested in land use than in land cover, as
artificial areas can be used for many different purposes, each
of which has a different impact on the Earth.
Table 7 List of thematic LUC datasets characterizing vegetation covers

| LUC map | Spatial resolution | Thematic information | Timeframe | Does it support change detection? |
|---|---|---|---|---|
| VCF5KYR | ≈5.6 km | Percentage of tree cover, non-tree vegetation cover and bare ground | 1982-1993, 2001-2016 | Possible |
| The World's Forests 2000 | 1 km | 3 forest classes | 1995/96 | One-date map |
| Global mangrove distribution | 30 m | Mangrove extent | 1997/00 | One-date map |
| FCover | 300 m, 1 km | Percentage of vegetation cover | Every 10 days: 1999-2020 (1 km) and 2014 to the present (300 m) | Yes, through specific layers of change |
| Hybrid Forest Mask 2000 | 1 km | Percentage of forest cover | 2000 | One-date map |
| SYNMAP | 1 km | 26 vegetation classes | 2000 | One-date map |
| GFCC | 30 m | Percentage of tree cover and forest gains/losses | 2000, 2005, 2010, 2015 (tree cover); 1990-2000/2000-2005 (forest change) | Yes |
| Hansen Forest Map | 30 m | Percentage of tree cover and forest gains/losses | 2000-2019 | Yes, through specific layers of forest gains and losses |
| MOD44B | 250 m | Percentage of tree cover | 2000-2019 | Yes |
| PTC Global version | 500 m, 1 km | Percentage of tree cover | 2003 (1 km), 2008 (500 m) | Possible |
| FNF | 25 m | Forest extent | 2007-2010, 2015-2017 | Possible |
| Forests of the World 2010 | 250 m | Percentage of tree cover | 2010 | One-date map |
| TanDEM-X Forest/Non-Forest Map | 50 m | Forest/Non forest | 2011/16 | One-date map |

Description notes: Sects. 1 to 11 in chapter “Global Thematic Land Use Cover Datasets Characterizing Vegetation Covers”.
Table 8 List of thematic LUC datasets characterizing agricultural covers

| LUC map | Spatial resolution | Thematic information | Temporal frame | Does it support change detection? |
|---|---|---|---|---|
| Historic Croplands Dataset | 0.5 degrees | Cropland proportion | 1700-1992 | Yes |
| 1992 Croplands Dataset | 10 km (5 min) | Cropland proportion | 1992 | One-date map |
| Harvested Area and Yield for 4 Crops (1995-2005) | 10 km (5 min) | Map proportion for 4 crops | 1995, 2000, 2005 | Not for assessments at the cell level |
| GMRCA | 10 km | Rainfed cropland, 66 categories grouped into 9 | 2000 | One-date map |
| GIAM | 10 km | Irrigated cropland, 28 categories | 2000 | One-date map |
| Cropland and Pasture Area in 2000 | 10 km (5 min) | Cropland proportion, pastureland proportion | 2000 | One-date map |
| Harvested Area and Yield for 175 Crops | 10 km (5 min) | Map proportion for 175 crops | 2000 | One-date map |
| Global Agricultural Lands | 10 km | Cropland proportion, pastureland proportion | 2000 | One-date map |
| Global Cropland Extent | 250 m | Cropland extent | 2000/08 | One-date map |
| IIASA-IFPRI Cropland Map | 1 km | Percentage of cropland cover | 2005 | One-date map |
| GRIPC | 500 m | 3 cropland classes | 2005 | One-date map |
| FROM-GC | 30 m | Cropland extent | 2010 | One-date map |
| GFSAD1KCD | 1 km | 8 cropland classes | 2010 | One-date map |
| GFSAD1KCM | 1 km | 5 cropland classes | 2010 | One-date map |
| Global Synergy Cropland Map | 500 m | Percentage of cropland cover | 2010 | One-date map |
| UCL | 250 m | Percentage of cropland cover | 2014 | One-date map |
| GFSAD30 | 30 m | Cropland extent | 2015 | One-date map |
| LADA Dominant Crops | 8.3 km | Up to 534 categories | Data fusion | One-date map |
| ASAP Land Cover Masks | 1 km | Percentage of cropland/rangeland covers | 2019 | One-date map |

Description notes: Sects. 1 to 8 in chapter “Global Thematic Land Use Cover Datasets” (assignment to rows not recoverable).
Table 9 List of thematic LUC datasets characterizing artificial covers

| LUC map | Spatial resolution | Thematic information | Timeframe | Does it support change detection? | Description note |
|---|---|---|---|---|---|
| Global Urban Land | 30 m | Artificial areas extent | 1980, 1990, 1995, 2000, 2005, 2010, 2015 | Yes | Sect. 1 |
| GHSL | 10 m (2018); 20 m (2016); 30 m, 250 m, 1 km (1975-2014) | Built-up areas extent; percentage of built-up areas (2014) | 1975, 1990, 2000, 2014, 2016, 2018 | Yes, except for the 2016 layer | Sect. 2 |
| GAIA | 30 m | Artificial areas extent | 1985-2018 | Yes | Sect. 3 |
| GUB | 30 m | Urban boundaries | 1990, 1995, 2000, 2005, 2010, 2015, 2018 | Yes | Sect. 3 |
| Global Urban Expansion 1992-2016 | 1 km | Urban areas extent | 1992, 1996, 2000, 2006, 2010, 2016 | Yes | Sect. 4 |
| ISA | 1 km | Impervious area density | 2000/01, 2010 | Unknown | Sect. 5 |
| URB MAP | 500 m | Urban extent | 2001/05 | One-date product | - |
| HBASE | 30 m, 250 m, 1 km | Urban areas extent | 2010 | One-date product | Sect. 6 |
| GMIS | 30 m | Percentage of impervious areas | 2010 | One-date product | Sect. 6 |
| GUF | ≈12 m, ≈84 m | Built-up areas extent | 2011 | One-date product | Sect. 7 |
| WSF | 10 m, 100 m, 250 m, 500 m, 1 km, 10 km | Settlement areas extent | 1985-2015, 2014/15, 2019 | In the future | Sect. 8 |
| GISM | 30 m | Impervious areas extent | 2015 | One-date product | Sect. 9 |

Description notes refer to sections of chapter “Global Thematic Land Use Cover Datasets Characterizing Artificial Covers”.


3.3.4 Global Thematic LUC Maps Focusing
on Water and Other Covers

Some thematic LUC products focus specifically on water
covers, two of which provide information on their change
over time (Table 10). Other products offer a hybrid between
general and thematic LUC datasets. These include the Global
1-km Consensus Land Cover, which provides a LUC
thematic map for 12 different land covers (Tuanmu and Jetz
2014). It has 12 layers, each of which contains information
about the fraction of the pixel occupied by the cover being
mapped. A thematic LUC dataset with a similar approach
was obtained for 13 different covers as part of the ClimA- 
frica project for the period 1901-2017 (Churkina et al. 
2009). Like other similar datasets already reviewed, it was 
obtained by a model based on different sources of historical 
LUC information. 
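
Datasets organised as one fractional layer per cover can be converted into a conventional categorical map by taking, for each pixel, the cover with the largest fraction. The sketch below shows this idea with three hypothetical layers and made-up percentages for three pixels; the class names and values are illustrative assumptions and are not taken from the Global 1-km Consensus Land Cover or any other product.

```python
# Hypothetical class names and made-up per-pixel cover fractions (in %)
layers = {
    "trees": [70, 10, 30],
    "cropland": [20, 60, 30],
    "water": [10, 30, 40],
}


def dominant_cover(layers, pixel):
    """Return the class with the largest fraction at the given pixel index."""
    return max(layers, key=lambda name: layers[name][pixel])


n_pixels = len(next(iter(layers.values())))
dominant = [dominant_cover(layers, i) for i in range(n_pixels)]
print(dominant)  # ['trees', 'cropland', 'water']
```

Note that this conversion discards information: the third pixel is labelled as water even though water covers only 40% of it in this made-up example.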


3.3.5 Supra-national Thematic LUC Maps 

We have only reviewed a few experiences of supra-national 
thematic LUC mapping (Table 11). The majority of them 
map vegetation covers, focusing especially on areas of 
special biodiversity or environmental value. 
Table 10 List of thematic LUC datasets characterizing water and other covers

| LUC map | Spatial resolution | Thematic information | Timeframe | Does it support change detection? |
|---|---|---|---|---|
| Historical land use based on Synmap landcover | 0.5 degrees | 13 themes (map proportion for each) | 1901-2007 | Yes |
| Global Surface Water | 30 m | Water occurrence (1-100) | 1984-2019 | Yes, through specific product |
| CCI WB | 150 m | Water/no water | 2000/12 | One-date product |
| Daily Global Surface Water Change Database | 500 m | Water, 3 categories | 2001-2016 (daily) | Yes |
| Global 1-km Consensus Land Cover | 1 km | 12 themes (map proportion for each) | Data fusion | One-date product |

Table 11 List of thematic supra-national LUC datasets

| LUC map | Extent | Spatial resolution/Scale | Thematic information | Timeframe | Does it support change detection? | Description note |
|---|---|---|---|---|---|---|
| TREES Vegetation Map of Tropical South America | Amazon basin | 1 km | Vegetation, 14 categories | 1992 | One-date map | - |
| Circumpolar Arctic Region Vegetation | Arctic region | 1:7,500,000 | Vegetation, 20 classes | 1993/95 | One-date map | - |
| Insular Southeast Asia - Forest Cover Map | Insular Southeast Asia | 1 km | 5 forest classes | 1998/00 | One-date map | Sect. 1 |
| Continental Southeast Asia - Forest Cover Map | Continental Southeast Asia | 1 km | 8 forest/wood classes | 1998/00 | One-date map | Sect. 2 |
| Central Africa - Vegetation map | Cameroon, Central African Republic, Republic of Congo, Equatorial Guinea, Gabon, DR Congo | 1-5 km | Vegetation | 1987/93 | One-date map | - |
| Congo Basin Monitoring Map | Congo River Basin | 57 m | Forest extent, forest probability, forest cover clearing | 1990/00 | Information on forest clearing | Sect. 3 |
| FACET | DR Congo, Congo, Gabon | 60 m | Forest, 3 cover categories + gains and losses | 2000, 2005, 2010 | Yes | - |
| MARS Crop Mask Over Africa | Africa | 250 m | Cropland extent | One date, different depending on the mapped area | One-date map | Sect. 4 |
| HRL Impervious | Europe (EU) | 10 m (after 2018); 20 m, 100 m (before 2018) | Extent and percentage of impervious areas | 2006, 2009, 2012, 2015, 2018 | Yes, through layer of changes | Sect. 5 |
| HRL Forests | Europe (EU) | 10 m (after 2018); 20 m, 100 m (before 2018) | Percentage of tree cover areas, leaf type and forest type | 2012, 2015, 2018 | Yes, through layer of changes | Sect. 5 |
| HRL Grasslands | Europe (EU) | 10 m (after 2018); 20 m, 100 m (before 2018) | Extent of grassland areas | 2015, 2018 | Yes, through layer of changes | Sect. 5 |
| HRL Water | Europe (EU) | 10 m (after 2018); 20 m, 100 m (before 2018) | 5 water-wet classes | 2015, 2018 | Unknown | Sect. 5 |
| HRL Small Woody Features | Europe (EU) | 5 m | Extent of Small Woody Features | 2015 | Not at the moment | Sect. 5 |
| ESM | Europe (EU) | 2 m (2015); 2.5 m, 10 m (2012) | Built-up extent (2015); residential areas extent (2012); 13 built-up categories (2012); percentage of built-up areas (2012) | 2012, 2015 | No | Sect. 6 |

Description notes refer to sections of chapter “Supra-national Thematic Land Use Cover Datasets”.



These maps are usually produced by international institutions,
such as the European Commission, or research groups from
internationally renowned universities, which are interested in
monitoring and understanding the land dynamics of high
biodiversity areas of worldwide importance.

The European Commission, through the Copernicus 
programme, is behind some of the few supra-national the- 
matic LUC datasets that focus on other covers such as 
artificial surfaces or agricultural areas. 


4 Reference Land Use Cover Data 


Reference data is required to train supervised remote sensing 
classifiers and to validate LUC maps. Reference LUC 
datasets consist of a series of geographically distributed 
sample points with LUC information. Each point contains 
information about the specific land use or cover in the pixel 
or polygon of the Earth's surface represented by the point. 
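
As an illustration of how reference points are used in validation, the following minimal sketch compares the category assigned by a map with the category recorded in the reference data at the same sample points. It builds a confusion matrix and the overall accuracy from made-up labels; the category names and values are assumptions chosen only for the example.

```python
from collections import Counter

# Hypothetical reference labels and the map labels at the same sample points
reference = ["forest", "forest", "cropland", "urban", "cropland", "forest"]
mapped = ["forest", "cropland", "cropland", "urban", "cropland", "forest"]

# Confusion matrix stored as counts of (reference, mapped) pairs
confusion = Counter(zip(reference, mapped))

# Overall accuracy: share of sample points where map and reference agree
correct = sum(n for (ref, m), n in confusion.items() if ref == m)
overall_accuracy = correct / len(reference)

for (ref, m), n in sorted(confusion.items()):
    print(f"reference={ref:9s} mapped={m:9s} count={n}")
print(f"overall accuracy = {overall_accuracy:.2f}")
```

In this made-up sample, five of the six points agree, and the single disagreement is a reference forest point mapped as cropland.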

The reference datasets are subject to the same spatial 
abstraction required in LUC maps. Reference points are 
associated with a specific pixel or polygon. The level of 
abstraction required varies depending on the size of these 
points. The uncertainty of the reference information will also
vary accordingly. The fact that a single land use or cover is
assigned to a whole pixel or polygon, even though they may 
contain other land uses or covers, can also produce uncer- 
tainty. In addition, there is always a degree of subjectivity in 
the decision to assign a pixel or polygon to a particular 
category, especially in borderline cases that are not clear-cut. 
This can create an additional source of uncertainty. 

Relatively few general LUC reference datasets are cur- 
rently available. This is because many reference datasets 
were created ad hoc every time a new LUC map was vali- 
dated or reference data was required to train a remote 
sensing classifier, and it was therefore unnecessary to have a 
ready supply of general LUC reference datasets. These 
datasets are also affected by some degree of thematic gen- 
eralization, as is any LUC map. LUC information must 
conform to a specific classification system or legend. Given 
the ad hoc nature of many reference datasets, the classifi- 
cation or legend used to classify the land uses and covers 
was normally also case-specific. However, the recent emer- 
gence of standard LUC reference datasets aimed at a wide 
range of users and research fields has extended the use of 
standard legends and classification systems, such as the 
FAO LCCS, when drawing up these datasets. 

One of the most renowned LUC reference datasets is the
Land Use Cover Area frame Survey (LUCAS), produced by
EUROSTAT every 3 years since 2006. It is made up of
more than 330,000 survey points across the EU.[3] An
increasing number of countries have taken part in every new 
version of the survey. Of all the LUC reference datasets 
available, this is the most comprehensive. For each point, 
experts collect information about land uses, land covers and 
other relevant environmental parameters. LUCAS also 
includes four photographs for each surveyed point. It is the 
only LUC reference dataset reviewed that provides a 
coherent time series of data for different years. 

In recent years, various reference datasets used to validate 
and train classifiers of global LUC maps have been made 
available online, so enabling them to be used for other 
purposes rather than just in the production of one specific 
map. The work done by the team from the GOFC-GOLD 
Land Cover Office is of special note. They collected and 
improved the reference datasets from six different LUC 
products (GLC2000, GlobCover 2005, STEP, VIIRS, 
GLCNMO and the urban dataset from the University of 
Tokyo). Samples of these datasets (with up to 70% of all the 
available reference points) are freely available for download 
on the project website.[4]

There is a growing trend to gather reference data through 
crowdsourcing and volunteering initiatives. Information 
gathered in this way is often referred to as Volunteered 
Geographic Information (VGI) and is part of citizen science. 
Members of the public create reference LUC information 
that will later be used to train classifiers and validate final 
maps. The information is gathered by local volunteers across 
the world, so taking advantage of local expertise. It is also a 
good source of cheap reference information. However, 
production methods of this kind have many related limita- 
tions and uncertainties. 

The most famous of these initiatives is Geo-Wiki, which
is frequently used to collect LUC information for calibration
and validation practices. Geo-Wiki provides a user-friendly
online tool that makes it very easy to visualize LUC maps
and to collect the reference LUC data required to validate
them. Many international research projects working on LUC
mapping and citizen science have based their research on
Geo-Wiki. One of the most important is the H2020 LandSense
Citizen Observatory.[5] It produced a global LUC reference
dataset over four campaigns (Fritz et al. 2017).


[3] https://ec.europa.eu/eurostat/web/lucas.
[4] http://www.gofcgold.wur.nl/sites/gofcgold_refdataportal.php.
[5] https://landsense.eu/.
Sahariah et al. (2017) also produced a global LUC reference
dataset for cropland land covers using Geo-Wiki and
crowdsourcing. Both datasets are available online in the
PANGAEA repository for any interested user.[6]

The Australian Terrestrial Ecosystem Research Network
(TERN) has developed a specific Geo-Wiki application to
validate Australian LUC maps: AusCover.[7] Also associated
with Geo-Wiki, the LACO-wiki platform provides another
tool for the collection of LUC reference datasets.[8] Users can
easily validate their own LUC maps on this platform, which 
includes a repository of reference data created or hosted by 
the community. It is a very comprehensive, user-friendly 
tool for LUC reference data production and LUC map val- 
idation, which has outperformed the capabilities of 
Geo-Wiki for this specific task. 

Many other tools and platforms have been developed in 
recent years with similar purposes: Collect Earth, GLFC LT, 
VIEW-IT... (Bey et al. 2016). However, although these 
platforms offer the tools required to create LUC reference 
datasets through crowdsourcing, many of these datasets are 
not made available online. Even in the platforms based on 
crowdsourced information, the LUC reference data remains 
very case-specific and is not disseminated, so preventing its 
reuse in other situations. 

Although they cannot be considered LUC data as such, 
volunteered geo-referenced photographs may be useful for 
obtaining reference LUC datasets. They provide a fixed 
picture of a landscape at a given point in time. By analysing 
the picture, users can identify the dominant land cover or 
land use, so obtaining LUC reference data. 

Several initiatives for collecting volunteered photographs
of specific geographic locations are already ongoing. Flickr
is one of the most famous, although its purposes and
objectives have little to do with science or scientific
methods. The Degree Confluence Project (DCP)[9] aims to collect
photographs and descriptions of each integer degree
intersection of latitude and longitude on Earth. Geograph collects
representative photographs of every single square km in
England, Ireland[10] and Germany.[11] The Field Photo
Library[12] collects geo-referenced photos across the Earth.
Google Maps also hosts pictures and is now regarded as a
successor to Panoramio, a service similar to Flickr.


[6] https://doi.pangaea.de/.
[7] https://application.geo-wiki.org/branches/auscover/.
[8] https://old.laco-wiki.net/en/Welcome.
[9] http://confluence.org/index.php.
[10] www.geograph.org.uk/
[11] https://geo-en.hlipp.de/
[12] http://www.eomf.ou.edu/photos/

view of 21 global and 43 regional land-cover mapping
products. Int J Remote Sens 36:5309-5535.
https://doi.org/10.1080/01431161.2015.1093195


Comprehensive review of general LUC datasets available at 
global and continental scales. It also reflects on the progress 
made and the challenges that lie ahead, proposing a series of 
recommendations for future LUC mapping practice. 


Herold M, See L, Tsendbazar NE, Fritz S (2016) Towards an 
integrated global land cover monitoring and mapping 
system. Remote Sens 8:1–11. https://doi.org/10.3390/rs8121036 


This paper summarizes the state of the art on global LUC 
mapping. It identifies the areas where most progress has been 
made in the field, referring in particular to the products with 
greater spatial detail and more frequent temporal information; 
the increasing importance of validation; the progressive 
implementation of the FAO Land Cover Classification Sys- 
tem (LCCS) framework as the standard LUC classification 
method; and the increasing interest in citizen engagement. 
The paper also mentions some of the specific fields that have 
recently been the focus of scientific attention: data fusion; 
uncertainty analysis by data comparison; and quantification 
of LUC change. Finally, the authors reflect on the work that 
remains to be done and the challenges that lie ahead. 


Mora B, Tsendbazar N-E, Herold M, Arino O (2014) Global 
Land Cover Mapping: Current Status and Future Trends. 
In: Manakos I, Braun M (eds) Land Use and Land Cover 
Mapping in Europe. Practices & Trends. Springer, 
Dordrecht, Heidelberg, New York, London, pp 11–30. 


Book chapter offering a short but very comprehensive state 
of the art on global LUC mapping. It reviews the LUC 
datasets available in 2014 and summarizes the progress that 
had been made until then. It also points out the main issues 
with regard to global LUC mapping practice and objectives 
for the future. Many of these objectives have now been 
accomplished. 


Giri CP (ed) (2012) Remote sensing of land use and land 
cover. Principles and applications. CRC Press. 


One of the reference books on Land Use Cover mapping and 
analysis. It provides an introduction to the field, tracing its 
history and an overview of the main concepts relating to 
LUC mapping and remote sensing. It also addresses the main 
methodological issues in relation to LUC mapping using 
remote sensing techniques, such as validation practices, land 
cover change detection and image classification methods. In 
Part III, the book includes examples of regional LUC map- 
ping and LUCC monitoring. 


See L, Fritz S, Perger C, et al. (2015) Harnessing the power 
of volunteers, the internet and Google Earth to collect and 
validate global spatial information using Geo-Wiki. Technol 
Forecast Soc Change 98:324–335. 
https://doi.org/10.1016/j.techfore.2015.03.002 


Good description of the Geo-Wiki platform, its history, 
evolution and current capabilities. It also reviews some of 
the LUC reference datasets based on information collected 
through the platform. 


Tsendbazar NE, de Bruin S, Herold M (2015) Assessing 
global land cover reference datasets for different user 
communities. ISPRS J Photogramm Remote Sens 103:95–114. 
https://doi.org/10.1016/j.isprsjprs.2014.02.008 


The paper compares and analyses 12 LUC reference datasets 
in detail. These datasets are used in the production and 
validation of global LUC maps. This is one of the most 
comprehensive reviews of the LUC reference datasets cur- 
rently available. It also assesses the potential reuse of these 
datasets, focusing on the data requirements imposed by 
different user communities. The authors try to identify the 
particular features that LUC reference datasets must have to 
enable them to be used by a wide range of users. 


Wulder MA, Coops NC, Roy DP, et al. (2018) Land cover 
2.0. Int J Remote Sens 39:4254–4284. 
https://doi.org/10.1080/01431161.2018.1452075 


A long but detailed reflection on the progress that has been 
made and the changes in Land Cover mapping since the 
appearance of remote sensing. 


Abstract 


The increasing number of disciplines and public and 
private sectors interested in land use/land cover (LUC) in- 
formation has boosted the demand for and the production 
of related cartographic products. However, the commu- 
nicating power of the final maps may be impaired, if any 
of the cartographic transformations performed during the 
mapping process does not adapt well to the particular 
subject or area being mapped. This chapter takes the 
reader on a guided tour through the map production 
process, offering an overview of the cartographic lan- 
guage, the rules and practices that contribute to the 
success of the map as a communication tool and the most 
common forms in which LUC maps appear. Recent 
developments in geovisualization tools applied to LUC 
are also discussed. 


Keywords 
Mapping 


1 Introduction 


The main purpose of cartography is to communicate 
geospatial information. The map serves as a channel through 
which a message is transmitted from the sender—the map- 
maker—to the receiver—the map user (Robinson 1953, 1969; 
Muller 1975; Koláčný 1977; Ratajski 1978; Morrison 1976). 

Like any other communication tool, cartography pos- 
sesses its own language. The term “language” has been used 
by a number of authors in this field and can be defined as a 
system of signs enabling communication (Cauvin et al. 
2010a). For communication to be successful, these signs 
should be capable of conveying to the reader the concepts 
that the author wishes to transmit. Given that maps also seek 
to convey information through signs, cartography must be 
considered part of semiotics. Indeed, as early as 1952, 
Robinson developed this idea by introducing a whole system 
of specific symbols for mapmaking (Robinson 1952). 

Subsequently, various studies explored this concept in 
greater depth, culminating in 1967 with the seminal piece by 
Jacques Bertin “Semiology of Graphics”, a genuine world 
reference on this subject. This was followed in 1978 by 
Ratajski, who outlined that, in modern thematic cartography, 
the ultimate goal of semiotics is to build an accurate, 
unambiguous cartographic language. 

In cartography, semiotics unfolds as two different 
categories of signs: on the one hand it refers to geometric signs, 
the spatial dimensions (zero, one, two or three) and the 
geometric nature of map features (points, lines, polygons and 
volumes), and on the other, to visual variables, defined as 
the possible elementary variations in perceptible marks 
(Bertin 1967). This definition was frequently cited, and 
eventually revised, by other cartographers (Cauvin et al. 
2010a; Robinson 1953; Robinson et al. 1984; Monmonier 
1993; Slocum et al. 2005). 

In this chapter we will be focusing on both kinds of signs 
and their role in the cartographic representation of land 
use/land cover (LUC). 

Recent technological advances in the GIS industry have 
popularized cartography, giving rise to what some people 
refer to as a "geospatial society" in which maps are 
increasingly ubiquitous and used in all kinds of applications. 
This has brought new opportunities for cartography as a 
science but it also poses new challenges, one of which is that 
many new mapmakers lack the necessary cartographic skills 
to produce effective maps. Unfortunately, there are 
numerous examples in the literature that illustrate the fact 
that GIS has made it easy to produce large numbers of wrong or 
confusing maps more quickly than ever before. In the case of 
LUC mapping, no matter how sophisticated and expensive 
the technology for the collection and processing of the 
information may be, inexpert mapmakers often fail to 
communicate the relevant information correctly. 

In order to help overcome these issues, this chapter aims 
to provide the basic ground rules for the correct represen- 
tation and interpretation of LUC maps. 


2 Geometric Signs 


The geographic entities we find in the landscape are portrayed 
on maps as cartographic objects of varying geometric nature. 
Different land use areas are no exception and are usually 
depicted as polygons. The process for representing this 
information on a 2-dimensional piece of paper or on a screen is 
anything but simple as it involves, at least, the following 
transformations: (1) projecting the irregular and curved 
surface of the Earth on a plane, (2) selecting land use patches of 
sufficient size as to be visible (and readable) on the map, and 
(3) aggregating the information at the right administrative 
level when analysing LUC distribution over statistical spatial 
units. These three transformations have important implica- 
tions for LUC mapping, which we will now go on to explain. 


2.1 Cartographic Projection and LUC Mapping 


The representation of our curved planet on a 2-dimensional 
map requires the application of mathematical models, known 
as cartographic or mapping projections, to project the Earth's 
surface on a plane (Slocum et al. 2005). Deformations occur 
during the projection process, which provide differentiating 
criteria to enable us to classify these projections into three 
big families: conformal, equidistant and equivalent, the last 
of which is also referred to as equal area. 


• Conformal projections are used in navigation charts, as 
  their main characteristic is the preservation of angles. 
  Parallels and meridians intersect in a perpendicular 
  manner, so forming four 90° angles at each intersection 
  and an orthogonal network as a whole. However, these 
  maps show important distortions in terms of the 
  proportionality of areas and distances. 

• Equidistant projections preserve the distances between 
  specific pairs of points and distort areas and angles. These 
  kinds of projections are mainly used in engineering and 
  construction works. 

• Equivalent or equal area projections preserve the 
  proportionality of areas and by doing so distort the shapes 
  and distances. 

The bigger the area represented, the greater the impact of 
our choice of projection. This is noticeable in world maps 
where familiarity with the shapes of countries and continents 
makes it easy for the reader to understand the deformations in 
each case. However, in smaller areas whose shape is not 
usually familiar to the general population, the map reader 
will find it difficult to notice the deformations. Of course, 
given the limited portion of the Earth's surface portrayed, the 
effects of the deformations are not as obvious as in world 
maps, but they do exist and can have an impact on LUC 
mapping. Since the choice of the projection results in sig- 
nificantly different maps, as Fig. 1 shows, the mapmaker 
must decide which projection system suits their map best. 
A bad choice could result not only in an unwanted distorted 
map, but also in a map that estimates metrics incorrectly. 
LUC analysts want metrics that inform the reader about 
different aspects of LUC, among them land use category 
distribution patterns and clusters, and especially the size of 
individual or groups of patches. This means that LUC maps 
must preserve the proportionality of areas. Conformal and 
equidistant projections are unsuitable for this purpose and 
equivalent projections must therefore be used. 
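
As a rough illustration of why area preservation matters for LUC metrics, the following sketch compares the area of a 1° × 1° cell measured on the sphere with the area of the same cell measured on two projected planes. It assumes a spherical Earth of radius 6371 km, and it uses the Lambert cylindrical equal-area projection as a simple stand-in for the equal-area family (the maps in Fig. 1 use Mollweide instead). The function names are ours and purely illustrative.

```python
import math

R = 6371.0  # mean Earth radius in km (spherical approximation, an assumption)


def true_cell_area(lat_deg, size_deg=1.0):
    """Area (km2) on the sphere of a cell whose southern edge is at lat_deg."""
    lat1 = math.radians(lat_deg)
    lat2 = math.radians(lat_deg + size_deg)
    return R * R * math.radians(size_deg) * (math.sin(lat2) - math.sin(lat1))


def mercator_cell_area(lat_deg, size_deg=1.0):
    """Area (km2) of the same cell measured on the Mercator plane."""
    def northing(lat):
        return R * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))

    width = R * math.radians(size_deg)
    return width * (northing(lat_deg + size_deg) - northing(lat_deg))


def equal_area_cell_area(lat_deg, size_deg=1.0):
    """Area (km2) of the cell on the Lambert cylindrical equal-area plane."""
    width = R * math.radians(size_deg)
    height = R * (math.sin(math.radians(lat_deg + size_deg))
                  - math.sin(math.radians(lat_deg)))
    return width * height


def area_ratio(projected_area, lat_deg):
    """Projected area divided by true area; 1.0 means no areal distortion."""
    return projected_area(lat_deg) / true_cell_area(lat_deg)


for lat in (0, 40, 60):
    print(lat,
          round(area_ratio(mercator_cell_area, lat), 3),
          round(area_ratio(equal_area_cell_area, lat), 3))
```

Under these assumptions the Mercator ratio stays close to 1 at the equator and exceeds 4 at 60° of latitude, whereas the equal-area ratio remains 1 everywhere. Any patch size measured on a conformal map at high latitudes would be overestimated accordingly.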


2.2 The Minimum Mapping Unit in LUC Maps 


The minimum mapping unit, or MMU, defines the size of 
the smallest cartographic object that will appear on the map 
(Cauvin et al. 2010a), in this way determining the resolution 
and by extension the most appropriate scale for the map. 
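
The link between the MMU, the scale and the footprint of the smallest object on a printed map is simple arithmetic: linear distances shrink by the scale denominator and areas by its square. A minimal sketch, in which the function name and the example values (25 ha at 1:100,000) are chosen only for illustration:

```python
import math


def mmu_on_map_mm2(mmu_hectares, scale_denominator):
    """Return the printed area, in mm2, of an MMU given in hectares."""
    ground_m2 = mmu_hectares * 10_000             # 1 ha = 10,000 m2
    map_m2 = ground_m2 / scale_denominator ** 2   # areas scale with the square
    return map_m2 * 1_000_000                     # 1 m2 = 1,000,000 mm2


area_mm2 = mmu_on_map_mm2(25, 100_000)
side_mm = math.sqrt(area_mm2)  # side of the equivalent square on paper
print(f"{area_mm2:.1f} mm2, i.e. a square of about {side_mm:.1f} mm per side")
```

With these values the smallest mapped patch covers 25 mm² on paper, a square of 5 mm per side. Running the same function with other scales gives a quick check of whether a proposed MMU would still be legible.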

Today, the predominance of digital maps over 
paper-based maps and their capacity to zoom in and out 
mean that the MMU is not as obvious as in the past. How- 
ever, all maps are affected by the mapmaker’s choices 
regarding their final scale, and the MMU has to be set in 
such a way as to facilitate the useability and readability of 
the map. In digital maps, the zoom feature may incorporate 
‘intelligent’ functions, which allow it to display certain map 
elements, features and labels, solely at the appropriate level 
of zoom. The result is that when the user zooms out, the 
smaller features are hidden and when they zoom in again, 
more and more small features become visible. For the 
intelligent zoom to work properly, the mapmaker must 
establish a different MMU at each zoom level, in this way 
deciding which elements will be visible at each different 
scale, an important decision in the mapmaking process. 

CORINE Land Cover is a well-known European project, 
which established an MMU of 25 hectares for areal entities 
and a minimum width of 100 m for linear features (European 
Environment Agency 2017). This means that in a printed 
map at the recommended working scale of 1:100,000 the 
MMU will occupy 0.25 cm² or 25 mm², i.e. a square of 
5 mm × 5 mm representing 500 m × 500 m on the ground. 

Fig. 1 Impact of the cartographic projection on map appearance at 
global and local (Guadiamar River Basin) scales. a Mercator projection 
(conformal), b Mollweide projection (equal area), c ETRS89 / UTM 
zone 29N (conformal), d Mollweide (equal area), e Europe Equidistant 
Conic (equidistant). For demonstration purposes only, the differences 
between (d) and (e) have been accentuated by applying a World and a 
European projection system respectively 

The MMU also plays an important role in the data 
collection phase. Regardless of whether data is collected by 
field work or by interpretation of aerial or satellite imagery, 
the features that are smaller than the MMU will not appear 
on the map. 

Some authors work almost exclusively with raster 
structures for which the pixel is the basic unit. As a result, they 
tend to conceive the MMU in terms of pixel size. From this 
perspective, it is generally accepted that the smallest 
observable feature in the final map, i.e. the MMU, should 
comprise at least four contiguous pixels (NOAA 2011). 

When it comes to determining the MMU of LUC maps, it 
is important to differentiate between databases and maps. 
Patches that might be a suitable size for data analysis could 
be completely inappropriate for map publishing. Single 
pixels or small groups of pixels forming small areas below 
the MMU threshold might be considered in data analysis, 
but would not appear on the map. 
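
The distinction between the analysis database and the published map can be sketched on a small categorical raster. The example below assumes 4-connectivity (only pixels sharing an edge are contiguous), a four-pixel MMU and an invented toy grid. It lists the patches that fall below the threshold: these would stay in the database but would have to be generalized before publishing.

```python
from collections import deque


def patches(grid):
    """Yield (category, cells) for every 4-connected patch in the grid."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            category = grid[r][c]
            cells, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill over same-category pixels
                y, x = queue.popleft()
                cells.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and grid[ny][nx] == category):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            yield category, cells


def below_mmu(grid, min_pixels=4):
    """Return the patches that are smaller than the MMU (in pixels)."""
    return [(cat, cells) for cat, cells in patches(grid)
            if len(cells) < min_pixels]


# Hypothetical analysis raster: one isolated urban pixel inside a forest.
analysis_grid = [
    ["forest", "forest", "forest", "crop"],
    ["forest", "urban",  "forest", "crop"],
    ["forest", "forest", "crop",   "crop"],
]

small = below_mmu(analysis_grid)
print(small)
```

Here only the single urban pixel is flagged; the forest (seven pixels) and crop (four pixels) patches meet the threshold. How flagged patches are then merged into their neighbours is a separate generalization decision that this sketch does not address.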

Three intrinsic characteristics of LUC mapping must be 
taken into consideration when deciding the most appropriate 
MMU: (i) Confusion between use and coverage, (ii) Defini- 
tion of land use categories and associated land size, and 
(iii) High sensitivity of LUC maps to the interrelations 
between MMU and scale. The scale at which LUC infor- 
mation is expressed also has an enormous impact on the 
communication capacity of the resulting map (Wu and 
Harbin 2006; García-Álvarez et al. 2019). 

In what is a common confusion between land use and 
land cover, different MMUs can result in maps showing 
different categories. For instance, at a relatively coarse 
resolution, a MMU of 1 km² would lead to an airport being 
depicted as such in both a land use map and a land cover 
map. However, if we increase the resolution by reducing the 
MMU to 50 m, the land use map would still depict it as an 
airport, but the land cover map would classify the areas 
covered by runways, buildings, or green areas into different 
categories. 

The second characteristic of LUC information that affects 
the MMU is directly related to the first. The increasing 
availability of Earth observation products with greater spatial 
resolution could lead to the false idea that the higher the 
resolution of the images, the better the quality of the data 
obtained from them. However, land use, i.e. the “arrange- 
ments, activities and inputs people undertake in a certain 
land cover type to produce, change or maintain it" (Di 
Gregorio and Jansen 2000) cannot be observed in areas 
smaller than that required to carry out said activities and 
arrangements. For instance, the MMU for a LUC map 
category representing low-density residential development 
must be at least as large as the basic unit (house with 
garden) for this kind of land use. 

The third intrinsic characteristic of LUC information that 
impacts on the MMU is its nature as a covering phe- 
nomenon. Mapping LUC information involves the delimi- 
tation of areas showing homogeneous coverage. This poses a 
problem in the data collection phase of small-scale LUC 
maps, in which the MMU covers a significantly large area 
that probably includes several LUC categories. In these 
cases, the identification of homogenous areas becomes a 
much more complex task. In order to assign a single value to 
the area in question, the cartographer must apply one of the 
available criteria. The most frequently used criteria include 
allocating the area: (i) to the LUC category covering the 
largest proportion of the area or (ii) to the predominant LUC 
category in the surrounding area. Related issues arise when 
attempting to downscale or upscale previously existing 
geospatial information. This increases the uncertainty of the 
map (García-Álvarez et al. 2019) and could give rise to the 
Modifiable Areal Unit Problem (MAUP) and the Category 
Aggregation Problem (CAP). 
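
Criterion (i) above, allocation to the category covering the largest proportion of the area, can be sketched as a majority-rule aggregation of a fine categorical grid into coarser cells. The grid, the category codes and the function name are invented for illustration. Ties are resolved in favour of the category encountered first, which is an arbitrary choice on our part.

```python
from collections import Counter


def aggregate_majority(grid, factor):
    """Aggregate a categorical grid into blocks of factor x factor cells,
    assigning each block the category that covers most of its cells."""
    rows, cols = len(grid), len(grid[0])
    result = []
    for r0 in range(0, rows, factor):
        out_row = []
        for c0 in range(0, cols, factor):
            block = [grid[r][c]
                     for r in range(r0, min(r0 + factor, rows))
                     for c in range(c0, min(c0 + factor, cols))]
            out_row.append(Counter(block).most_common(1)[0][0])
        result.append(out_row)
    return result


# F = forest, C = crops, U = urban, W = water (hypothetical codes)
fine = [
    ["F", "F", "C", "C"],
    ["F", "U", "C", "C"],
    ["W", "W", "U", "C"],
    ["W", "F", "C", "C"],
]

coarse = aggregate_majority(fine, 2)
print(coarse)
```

In this toy case the two urban cells are never the majority in their block, so the urban category vanishes from the aggregated grid. This is one way in which a larger MMU adds uncertainty to the map.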


2.3 The Modifiable Areal Unit Problem (MAUP) 
and the Category Aggregation Problem 
(CAP) 


LUC can be mapped and conceptualized in different ways; 
from the most typical LUC maps in which the areas are 
classified into homogeneous categories, to choropleth maps 
which summarize, at selected administrative levels, different 
statistical values for the LUC they contain. In all cases, LUC 
information is expressed via polygon-based geometry but 
the MAUP is most noticeable in choropleth maps. 

The MAUP was analysed in depth by Openshaw and 
Taylor (1979) and its effects have been tested in a number of 
research studies (García-Álvarez et al. 2019; Cebrecos et al. 
2018; Rajabifard et al. 2000). The MAUP appears when a 
specific variable is observed in spatial units of different 
levels within a hierarchical structure (Eagleson et al. 2002, 
2003). The MAUP causes two effects—zoning and scale. 
The first refers to the different patterns and associated sta- 
tistical measures resulting from different aggregation 
arrangements within the same hierarchical level. The second 
takes the form of new and different patterns of the analysed 
variable that appear when downscaling, i.e. when units are 
aggregated together to make larger units. 
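
Both effects can be reproduced with a few invented numbers. The sketch below assumes eight base units of equal area (100 ha each) along a transect, with a hypothetical urban area in hectares for each one. It then averages them under two zonings with the same number of zones (zoning effect) and under a coarser zoning (scale effect).

```python
def zone_means(values, zones):
    """Mean of `values` in each zone; a zone is a list of base-unit indices."""
    return [sum(values[i] for i in zone) / len(zone) for zone in zones]


# Hypothetical urban area (ha) in eight base units of 100 ha each.
urban_ha = [90, 80, 10, 20, 70, 90, 10, 10]

zoning_a = [[0, 1], [2, 3], [4, 5], [6, 7]]   # four zones
zoning_b = [[0], [1, 2], [3, 4], [5, 6, 7]]   # four zones, other boundaries
coarse = [[0, 1, 2, 3], [4, 5, 6, 7]]         # two larger zones

means_a = zone_means(urban_ha, zoning_a)
means_b = zone_means(urban_ha, zoning_b)
means_coarse = zone_means(urban_ha, coarse)

print(means_a)       # strong contrast between neighbouring zones
print(means_b)       # same data, much flatter pattern
print(means_coarse)  # contrast almost disappears
```

The underlying data never change, yet zoning A suggests sharply alternating urban and non-urban zones, zoning B a gradual decline, and the coarse zoning two nearly identical halves. A choropleth map built on any one of them would tell a different story.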

LUC mapmakers and users need to be aware of the 
impact of the MAUP in order to facilitate both successful 
communication and well-informed decision-making. 

Another issue in relation to the downscaling of informa- 
tion is the Category Aggregation Problem (CAP), which was 
formulated more recently (Pontius and Malizia 2004). This 
problem refers to the important consequences of grouping the 
categories in a thematic legend together. This leads to the 
disappearance of certain subcategories from the legend, so 
complicating the analysis of the changes in these variables 
over time (García-Álvarez 2018). The aggregation of cate- 
gories also reduces the level of detail offered by the map. 
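
A small numerical sketch shows how aggregating the legend hides change. The category names, the two-date sequences and the aggregation table are hypothetical and serve only to illustrate the mechanism.

```python
def count_changes(before, after, legend=None):
    """Count the locations whose category differs between two dates.
    If `legend` is given, categories are first mapped to broader classes."""
    if legend is not None:
        before = [legend[c] for c in before]
        after = [legend[c] for c in after]
    return sum(1 for b, a in zip(before, after) if b != a)


# Hypothetical categories observed at five locations on two dates.
date_1 = ["olive groves", "vineyards", "pasture", "broadleaf forest", "urban"]
date_2 = ["vineyards", "vineyards", "urban", "conifer forest", "urban"]

# Hypothetical aggregation of the detailed legend into three broad classes.
aggregated_legend = {
    "olive groves": "agriculture",
    "vineyards": "agriculture",
    "pasture": "agriculture",
    "broadleaf forest": "forest",
    "conifer forest": "forest",
    "urban": "artificial",
}

detailed_changes = count_changes(date_1, date_2)
aggregated_changes = count_changes(date_1, date_2, aggregated_legend)
print(detailed_changes, aggregated_changes)
```

With the detailed legend three locations change category; with the aggregated legend only one does, because transitions within agriculture and within forest are no longer visible.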

In LUC these constraints are key aspects in the correct 
production and analysis of related maps. Figure 2 illustrates 
some of these issues. At the scale used in these maps, the 
progressive categorical aggregation from left to right shows 
the need for larger MMUs. The most categorically detailed 
map is very difficult to read, while the most generalized map 
provides insufficient information. Setting the MMU there- 
fore entails a trade-off between the scale, the level of anal- 
ysis sought, and the number of categories. This means that 
both components (thematic and spatial) of the geographic 
information must be considered simultaneously when setting 
the MMU in LUC mapping. 


3 Visual Variables 


The expression ‘visual variable’ was used by J. Bertin 
(1967) to designate the components of a system of signs. 
Later on, Slocum et al. (2005) defined it as the variations and 
perceived differences in the signs used to represent a the- 
matic phenomenon. Other terms adopted by cartographers 
when referring to visual variables are symbol, graphical 
variable, graphical primitive or mark. Bertin identified six 
visual variables: shape, orientation, colour, value, grain and 
size, which have since formed the basis of studies of 
cartographic semiotics (Slocum et al. 2005). 

Fig. 2 Examples of LUC map information and issues arising from changes in the MMU and the aggregation of categories 


3.1 Shape 

Shape is the first variation distinguishable on any map. It 
helps identify the different types of objects appearing on a 
map, which are described by different contours. These 
contours may be regular and abstract (geometric signs) or 
figurative (pictograms). Shape corresponds to a nominal 
level of measurement and only allows us to convey either 
associations between objects with the same shape or differ- 
ences between elements represented by different shapes. 
Shape is neither ordinal nor quantitative and cannot therefore 
be used for thematic phenomena with ordinal or quantitative 
levels of measurement (Cauvin et al. 2010a). 

In LUC mapping as in any other kind of polygon-based 
mapping, shape can only affect filling patterns, not the shape 
of the polygons themselves. The only exceptions to this rule 
are cartograms, in which both the size and the shape of 
polygon objects vary in line with quantitative thematic val- 
ues. In maps showing point and line features, shape is fre- 
quently used to highlight different associations between 
categorical objects. 


3.2 Orientation 


The orientation of a sign refers to its position relative to a 
reference framework and it is expressed in degrees (between 
0 and 360). As with shape, orientation can only represent the 
attributes on a nominal level of measurement and can only 
affect point-based elements (Cauvin et al. 2010a). For line, 
polygon or volume geometries, the orientation would only 
affect the filling patterns (textures) chosen. It is used much 
less frequently than other visual variables, especially in LUC 
mapping. 


3.3 Colour Hue 


Colour hue (often referred to simply as colour) is the most 
complex visual variable and its use in maps has been exten- 
sively analysed by cartographers (Bertin 1967; Robinson 
et al. 1984; Monmonier 1993; Slocum et al. 2005; Cauvin 
et al. 2010a). Colour varies depending on the light source, the 
reflective characteristics of the observed object and the human 
eye. The visible world is in fact composed of colourless 
matter but electromagnetic waves with different wavelengths 
are perceived as different colours by most people. 

As a visual variable on a map, unlike shape and orien- 
tation, colour can be used not only in points, but also in lines 
and polygons. As regards its properties in relation to the- 
matic information, colour is selective, separative and asso- 
ciative. Colour hues are neither ordered nor quantitative, 
which means they cannot be used to represent attributes 
measured at quantitative scales, and are therefore only 
suitable for representing phenomena measured at nominal 
scales. However, under certain conditions and when arran- 
ged in the appropriate order, colours can also be used to 
express order and opposition. For instance, yellow, orange 
and red can represent low, medium, and high data values, 
respectively (White 2017). 

In addition to Bertin's pioneering work and the revisions to 
his visual variables made by subsequent authors, a milestone 
in the application of colour hue schemes in digital mapping 
is the ColorBrewer tool developed by Cynthia Brewer at 
Penn State University (Brewer 2021). The ColorBrewer tool 
offers an extensive collection of colour ramps, which are 
well-suited for any measurement scale and for colour-blind 
map users. In terms of LUC mapping, an interesting pro- 
posal for colouring LUC maps with coarse pixel data can be 
found in Raposo et al. (2016). 

The use of colour in mapping is also affected by its cul- 
tural connotations. As pointed out by Hall (1971), signs and 
gestures have different, sometimes even contradictory 
meanings depending on the cultural background. One 
example is the connotations associated with red, as danger, 
versus green, as safety in western cultures. 

In addition to these cultural constraints, for map 
communication to be successful, the use of colour in mapping 
must honour some generally accepted conventions. 
In LUC mapping, for instance, water bodies are always 
represented in light blue, while residential areas are normally 
depicted in red. 

A very useful, well-known colour scheme for LUC 
mapping was established by the European Environment 
Agency in the CORINE Land Cover project (EEA 2017). Its 44 
categories are represented by colours whose different hues 
are assigned to different groups of categories. In this way, 
artificial areas are represented in reds and purples, agricul- 
tural uses in yellow, forests in green, open spaces in grey and 
green, and wetlands and water bodies in blue. 


3.4 Colour Value 


White (2017) defined colour value as the lightness or 
darkness of a colour from pure black to pure white. Its variation 
constitutes “a continuous progression which the eye 
perceives in the grayscale stretching from black to white” 
(Bertin 1967) in a given area. Cauvin et al. (2010a) noted 
that the term progression conveys the basic property of this 
visual variable: order. It can be expressed as the ratio of the 
respective quantities of black and white. 

As this is an excellent way of expressing order, it high- 
lights the differences in a hierarchical system. Even though it 
is frequently used to represent quantities, the human capacity 
to associate different colour values with different quantities 
is very limited. Today, however, digital mapping allows 
black to be allocated in amounts that vary in proportion with 
the thematic value, so making it possible to use value ramps 
that overcome this limitation. 

Like colour hue, colour value can be used in all geometric 
forms, although the best results are obtained on an area or 
volume, as the map user requires a certain minimum amount 
of surface area to perceive the variations of grey. 
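
The proportional allocation of black described above can be sketched as an unclassed grey ramp. The example assumes a simple linear relation between the thematic value and the percentage of black, and an 8-bit grey level per colour channel. Both function names are ours, and a production map might prefer a perceptually adjusted ramp.

```python
def percent_black(value, vmin, vmax):
    """Percentage of black assigned to `value`, linear between vmin and vmax."""
    if vmax == vmin:
        raise ValueError("vmin and vmax must differ")
    return 100.0 * (value - vmin) / (vmax - vmin)


def grey_hex(value, vmin, vmax):
    """Hexadecimal grey for `value`: white at vmin, black at vmax."""
    level = round(255 * (1 - percent_black(value, vmin, vmax) / 100))
    return "#{0:02x}{0:02x}{0:02x}".format(level)


# Hypothetical share of built-up land (%) in three administrative units.
for share in (0, 50, 100):
    print(share, grey_hex(share, 0, 100))
```

Each unit receives its own grey instead of being forced into a handful of classes, which is what allows value to carry quantitative information despite the reader's limited ability to match greys to numbers.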

Since colour value is not suitable for representing nominal 
data, in LUC maps it is only used to summarize quantitative 
variables related to land use within administrative areas. 


3.5 Texture 


Texture or pattern is a complex visual variable that com- 
prises a varying number of components depending on the 
author you consult. According to White (2017), textures 
combine size, value, hue, shape and orientation. Other 
authors reduce these components to shape, arrangement, 
grain and spacing. Shape is the basic graphic unit making up 
texture. Arrangement refers to the layout of the basic graphic 
elements, either regular or irregular. Grain refers to the size 
of these elements and spacing to the distance between them. 
The use of textures for data measured at different levels is 
also controversial. While White recommends that textures 
only be used for nominal and ordered attributes of areas and 
lines, other authors (Cauvin et al. 2010a) claim that they can 
also be used for quantitative data. 

Nowadays, textures are not used as often in mapping as 
they once were. In the past, when colour printing was sig- 
nificantly more expensive, textures were frequently used to 
fill out areas containing nominal, ordinal or quantitative 
information. Today textures have largely been replaced by 
colour. However, they are sometimes used in combination 
with other covering visual variables such as colour hue or 
colour value, so as to increase the amount of information 
provided by the map. 

Textures can be useful in LUC mapping when the basic 
LUC information is combined with other relevant informa- 
tion. In the case shown in Fig. 3, the area occupied by the 
Sierra de Guadarrama National Park in Spain has been 
texturized to differentiate it from the rest of the mapped area. 


3.6 Size 


Size is, together with colour value, the most frequently used 
visual variable for representing quantitative data. Size can be 
defined as the variation in the area or the volume of a sign. It 
is rarely used in LUC mapping as these maps are normally 
based on categorical data. Although in theory, size expresses 
quantity, order and selection (Bertin 1967), its use in 
representing qualitative information can lead to confusion. 
Thus, size is only recommended for representing ordinal or 
quantitative data. 

As regards the geometries of the map, size can only be 
fully applied to points. In the case of lines, since the distance 
between two points is fixed, size can only be applied as a 
variation in line width. As for polygons, any variation in 
their size based on a quantitative attribute other than surface 
area would result in the loss of their cartographic projection 
properties. Given this constraint, when the nature of the 
attribute is such that its representation with size is recom- 
mended, polygons may be represented by a point, usually at 
their centroid, which varies in size according to the value 
attached to the polygon attribute. 
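
When a polygon attribute is shown as a point symbol at the centroid, the size of the symbol has to be derived from the attribute value. A common convention, which is not stated in the text above and is assumed here, is to make the area of the symbol (not its radius) proportional to the value, so the radius grows with the square root. The function name and the 10 mm maximum radius are illustrative.

```python
import math


def symbol_radius(value, max_value, max_radius_mm=10.0):
    """Radius (mm) of a circle whose AREA is proportional to `value`.
    The largest value in the dataset is drawn with max_radius_mm."""
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value > 0")
    return max_radius_mm * math.sqrt(value / max_value)


# Hypothetical hectares of land use change per administrative unit.
changes_ha = {"unit A": 100, "unit B": 25, "unit C": 4}
largest = max(changes_ha.values())

for unit, hectares in changes_ha.items():
    print(unit, round(symbol_radius(hectares, largest), 2))
```

Under this assumption a unit with a quarter of the maximum value is drawn with half the maximum radius, so that the inked area, which is what the reader compares, remains in proportion to the data.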

LUC map products using this visual variable are therefore 
limited to those summarising quantities such as the propor- 
tion of land occupied by each land use category, the pro- 
portion of land undergoing a land use change between two 
dates, or other related quantitative variables. In all cases, 
these quantities are summarized on a superimposed spatial 
structure, usually administrative units. 


3.7 Visual Variables and Geometric Dimension 


In the previous paragraphs, we have seen how some visual 
variables adapt better than others to the varying geometric 
forms in which geographical information is presented. 
Figure 4 summarizes the recommended use of the visual 
variables with different geometries. Green cells show opti- 
mal combinations, red cells show inapplicable combinations 
and yellow cells show the combinations that are subject to 
certain conditions. Points accept all visual variables with the 
exception of textures, although some points may be big 
enough to accommodate a texture pattern. Given that lines are 
defined as the shortest distance between two points, they can 
only accept colour hue and colour value. However, a thick 
line can have different shapes and textures. As regards size, 
according to the above definition, lines can only vary in 
width, not in length. Polygons are more restrictive, in that 
they will only accept colour hue, colour value and texture. 
Any change in their shape, orientation or size would result in 
a distortion of the cartographic base which makes them 
unusable. However, these three visual variables could be 
applied to polygons when they (the visual variables) form 
part of the texture pattern that fills these polygons. 

Fig. 4 Visual variables and geometric dimension 


3.8 Visual Variables and Measurement Level 


In the above descriptions of the visual variables, we also 
outlined the meanings with which they are associated, and 
consequently the most suitable level of measurement for 
them. In general terms, the visual variables that can be 
ordered (colour value, size, texture and colour hue if prop- 
erly ordered) are best suited for attributes measured at 
ordinal level. Visual variables indicating quantity (size and 
to some extent colour value and texture) can be applied to 
represent attributes at interval or ratio measurement levels. 
For their part, the visual variables with selective and asso- 
ciative properties, such as colour hue, shape and orientation, 
are used to represent attributes measured at nominal level. 

Orientation is a special case. It usually has the same 
meaning as shape, but under certain conditions it can also be 
used to represent ordered attribute series. For instance, an 
arrow symbol pointed at any angle in the 360° of a circle 
could be associated with an ordered attribute depicting every 
point in a hierarchy based on the angle of the arrow. 

As regards textures, their complex nature makes them 
suitable for any measurement level. Changes in the shape, 
orientation and colour hue of the pattern of elements that 
make up the texture would apply to attributes at nominal 
scale while size and colour value variations would be used to 
represent attributes at quantitative and ordinal measurement 
scales. Figure 5 summarises the recommended application of 
visual variables to represent attributes with different mea- 
surement levels. 


Fig. 5 Visual variables and associated level of measurement (columns: nominal, ordinal and quantitative levels; rows: visual variables) 


4 Representing Nominal LUC Data 


Most common LUC maps depict an area or region, high- 
lighting with different colours the homogeneous patches of 
the different LUC categories it contains. As described above, 
for these maps to serve as successful communication tools, 
they must comply with a series of cartographic rules. 

In terms of cartographic projection, the proportionality of 
areas must be preserved. If not, it would be impossible to 
compare the respective size of the different categories on the 
map. An equal-area (equivalent) projection must therefore be used. 

The final size of the map will determine the scale and 
therefore the size of the Minimum Mapping Unit. In the case 
of digital maps, we recommend that an intelligent zoom be 
used so that the map only displays features equal to or 
greater than the minimum size. As a fixed image, the final 
LUC map must also strike a balance between the MMU and 
the number of LUC categories. 

The visual variable best suited for categorical data is 
colour hue. Its use in LUC mapping must adhere to generally 
accepted conventions such as the use of blue colours to 
represent water bodies, reds and purples for built-up areas 
and so on. 

In line with these recommendations, Fig. 6 presents an 
example LUC map for the Guadiamar River Basin area in 
Southwest Spain based on Corine Land Cover data for the 
year 2000. 


5 Representing LUC Quantitative Data 


As pointed out above, the cartographic representation of 
LUC quantitative data requires additional layers, such as 
administrative units, for the computation of these quantities 
at a meaningful spatial level. Some sort of selection must be 
undertaken in order for the resulting maps to be readable. 
Figure 7 shows examples of the percentage of land occupied 
by natural, agricultural and artificial land use categories 
respectively. 

As with any map representing quantitative attributes, 
special attention must be paid to the number of intervals and 
their limits. An excessive number of intervals would make it 
difficult to differentiate between the associated symbols, 
regardless of whether they are based on size or colour value. 
By contrast, if too few intervals are used, this will reduce the 
level of detail of the information provided by the 
map. Brooks and Carruthers (1953) suggested that the 
number of classes should be less than or equal to five times 
the decimal logarithm of the number of observations. Other 
authors suggested that the number of classes should be equal 
to 3.3 times the decimal logarithm of the number of 
observations plus 1 (Huntsberger 1961). In both cases the number 
of classes increases quickly in line with the number of 
observations, making it difficult for the map reader to 
differentiate between the symbols. The average maximum 
number of different colour values that humans can perceive 
in a map is seven (Olson 1975) and, according to Robinson 
(1998), the optimum number is five. 

Fig. 6 Example of LUC map. Guadiamar River Basin, Spain 

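
The two rules quoted above can be compared numerically. The following is a minimal sketch: the function names are illustrative, and rounding the results to whole classes is our assumption, since the text gives no rounding convention.

```python
import math


def max_classes_brooks_carruthers(n_observations):
    """Upper bound: five times the decimal logarithm of the observations."""
    # Rounded down, because the rule is stated as "less than or equal to".
    return math.floor(5 * math.log10(n_observations))


def classes_huntsberger(n_observations):
    """3.3 times the decimal logarithm of the observations, plus 1."""
    # Rounded to the nearest whole class (assumption).
    return round(1 + 3.3 * math.log10(n_observations))


if __name__ == "__main__":
    for n in (30, 100, 1000):
        print(n, max_classes_brooks_carruthers(n), classes_huntsberger(n))
```

With 100 observations, for instance, the rules give 10 and 8 classes respectively, both above the perceptual limit of seven colour values mentioned in the text.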
The limits established for each of the intervals have a 
strong impact on the final appearance and usefulness of the 
map. There are a large number of possible methods for 
establishing these limits, but not all of them adapt to all sorts 
of data. The distribution of the thematic variable must be 
taken into account, as some methods are only suited to 
certain specific distributions. Following work by Monmonier 
(1982), Cauvin et al. (2010b) explained the details of the 
various different methods and analysed their advantages and 
disadvantages. In this chapter, we will be focusing on the 
main methods available in standard GIS software. The 
varying impact of three of the most common methods can be 
seen in Fig. 8. 

Fig. 7 Quantitative maps showing the area occupied by different land use categories in the Guadiamar River Basin 

Fig. 8 Impact of the classification method in quantitative maps 
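
The effect of the classification method on class limits can be illustrated with a small sketch. The text does not name the three methods shown in Fig. 8, so the two used here (equal intervals and quantiles) are our assumption, chosen because they are available in most GIS packages. The function names and the sample values are hypothetical.

```python
import statistics


def equal_interval_breaks(values, n_classes):
    """Internal class limits that split the data range into equal widths."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / n_classes
    return [lowest + width * i for i in range(1, n_classes)]


def quantile_breaks(values, n_classes):
    """Internal class limits that put a similar number of values in each class."""
    return statistics.quantiles(values, n=n_classes, method="inclusive")


if __name__ == "__main__":
    # Hypothetical, strongly skewed percentages for five spatial units.
    skewed = [0, 1, 2, 3, 100]
    print(equal_interval_breaks(skewed, 2))
    print(quantile_breaks(skewed, 2))
```

With skewed data such as these, equal intervals place almost every unit in the first class, whereas quantiles split the units evenly. This is why the distribution of the thematic variable has to be examined before a method is chosen.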


6 Representing Qualitative and Quantitative 


Pie charts enable the simultaneous communication of quali- 
tative and quantitative LUC data. The pie symbol can display 
variations in colour hue, colour value, size and texture. It can 
represent nominal data by means of colour hue variations, 
while ordinal and quantitative data can be represented with 
size or colour value. Figure 9 shows the land occupied by 
natural, artificial and agricultural uses in the municipalities in 
the Guadiamar River Basin. Symbol size is proportional to the 
total area of the municipalities and the pie sections correspond 
to each of the categories coloured with a different hue. 


Fig. 9 Pie chart map representing the proportions of LUC categories in the municipalities in the Guadiamar River Basin 


7 Representing LUC Changes 


One of the key areas in LUC studies is the analysis of the 
cover changes that have taken place in the past or are pre- 
dicted to occur in the future, according to different scenarios 
(White and Engelen 1993; Camacho Olmedo et al. 2018; 
Hewitt et al. 2014; Guzman et al. 2020). The methods 
applied to undertake this analysis are usually based on the 
comparison of two input LUC maps with different dates. 

The cartographic representation of the LUC change that 
has taken place between these two dates is often expressed in 
terms of the amount of land gained or lost by each land use 
category. This is a quantitative attribute and is therefore 
subject to the constraints summarized in Sect. 5. 

As regards the representation of categories as nominal 
data, an excessively large number of land use categories in 
the input maps and their associated, theoretically possible 
transitions would in turn result in an excessively large 
number of new categories. This means that some kind of 
selection process must be performed. The options include: 
(i) reducing the map to the binary categories of "stable" and 
"changed"; (ii) selecting just one land use category to 
represent the areas gained or lost by it; and (iii) selecting the 
areas gained or lost by one specific land use category, in 
order to represent the land use categories from which or to 
which these areas have changed. 

In order to make the comparison, the two input maps 
must be overlaid. During this process, it is highly likely that 
new areas of varying size will appear on the output map. The 
issues relating to the MMU discussed in Sect. 2.2 apply to 
the representation or possible generalization of these new 
polygons. Figure 10 presents a composite of two input maps 
with LUC information for 1956 and 1999 respectively, an 
output map showing areas that have undergone LUC chan- 
ges between these dates and a second output map showing 
the main transitions that have taken place between LUC 
categories. 
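
The overlay of two dates can be sketched for raster data as follows. This is a minimal illustration, assuming two co-registered grids of equal size stored as lists of rows. The function name and the sample categories are hypothetical, and the MMU issues mentioned above are not handled.

```python
from collections import Counter


def overlay_change(map_t1, map_t2):
    """Overlay two categorical grids of identical shape.

    Returns a binary "changed" grid (option i in the text) and a count of
    every observed transition between categories.
    """
    changed = [
        [cat_1 != cat_2 for cat_1, cat_2 in zip(row_1, row_2)]
        for row_1, row_2 in zip(map_t1, map_t2)
    ]
    transitions = Counter(
        (cat_1, cat_2)
        for row_1, row_2 in zip(map_t1, map_t2)
        for cat_1, cat_2 in zip(row_1, row_2)
        if cat_1 != cat_2
    )
    return changed, transitions


if __name__ == "__main__":
    # Hypothetical 2 x 2 grids for two dates.
    luc_first = [["forest", "forest"], ["crop", "crop"]]
    luc_second = [["forest", "crop"], ["urban", "crop"]]
    changed, transitions = overlay_change(luc_first, luc_second)
    print(changed)
    print(transitions.most_common())
```

The transition counts make the selection problem concrete: with k categories there are up to k(k - 1) possible transitions, so only the most frequent ones can usually be mapped with distinct colour hues.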


8 New Forms of Visualizing 
and Communicating LUC Data 


Throughout the examples presented so far, we have made 
clear that LUC representation is a far from simple task and 
that LUC maps convey even the most relevant aspects of 
LUC information with difficulty. These limitations can have 
serious consequences when it comes to taking policy and 
land planning decisions. The abstract representation, nor- 
mally by means of colour hues, of land use categories or the 
transitions between them does not necessarily make it easier 
for users to understand the real landscape changes they 
represent. Policy makers may not be expert map users, and 
will therefore require more intuitive information in order to 
fully comprehend the impacts of predicted land use changes 
on landscapes, economy, society and the environment. Van 
Lammeren et al. (2010) found that users complained about 
an excessive amount of detail on A4-size printed maps, that 
the colours were too close, and that it was difficult to com- 
pare the maps. 

In an attempt to alleviate these issues, various interesting 
case studies have integrated new approaches to cartographic 
visualization (Cauvin et al. 2010c) such as realistic 3D 
models (Appleton et al. 2002; Paar 2006; van Lammeren 
et al. 2010), and have explored the use of historic photog- 
raphy to illustrate land use changes (Kull 2005). 

In addition to these realistic 3D examples, technological 
developments in the mapping industry have enabled the 
production of new cartographic tools that have yet to be 
explored in the communication of LUC information. Three 
areas are in need of further research and implementation. 
First, the current predominance of digital maps that are 
viewed through a computer device equipped with speakers, 
contrasts with the almost complete absence of research into 
sound mapping applied to LUC analysis. Second, the limited 
interactive capacity of LUC digital maps makes it difficult to 
compare them. And third, the possibilities offered by the 
computerised environment for visualizing animations, per- 
haps the most efficient tool for communicating changes over 
time, have yet to be applied in LUC change studies. 


9 Conclusions 


In this chapter we have reviewed the main cartographic 
methods for representing and communicating LUC and 
LUCC information. The maps produced must comply with 
basic cartographic rules and must therefore have an appro- 
priate cartographic projection, a balanced level of general- 
ization, MMU and attribute details, a suitable set of visual 
variables and, in the case of quantitative data, a proper 
method for the classification of the thematic variable. 

Even the maps that comply with these rules are often not 
fully comprehensible for their final users. This may be 
because the scale used in the final printed maps, the format 
in which most decision-makers receive the information, is 
too small or simply because not all the actors involved 
"speak" the cartographic language. 

In order to overcome these issues, new cartographic 
methods, including geovisualization techniques like realistic 
3D mapping, are being explored. Other technological 
advances like sound mapping, fully interactive mapping or 
animated mapping are still underused in LUC studies. The 
integration of realistic 3D models with animation and sound 
will enable the inclusion of moving living creatures (like 
animals or people), human-made moving objects (like 
vehicles or windmills), vegetation, topography, buildings, 
and variations in the atmosphere or the light. Progress of this 
kind in LUC representation will make LUC maps more 
realistic and will enhance their communication capabilities, 
which in turn will help ensure better-informed 
decision-making processes. 

Fig. 10 Cartographic representation of LUC changes 


Data Sources The author produced all figures included in 
this chapter for the purpose of this book. Data sources used 
are: 


— Spanish National Mapping Agency: Instituto Geográfico 
Nacional (IGN) at www.ign.es; 

— Spanish Agency for National Parks: Organismo Autónomo 
de Parques Nacionales (OAPN) at 
https://www.miteco.gob.es/es/parques-nacionales-oapn/; 

— Spanish National Bureau of Statistics: Instituto Nacional 
de Estadística (INE) at https://www.ine.es/; and 

— European Environment Agency (EEA) at 
https://www.eea.europa.eu/. 


References 


Appleton K, Lovett A, Sünnenberg G, Dockerty T (2002) Rural 
landscape visualisation from GIS databases: a comparison of 
approaches, options and problems. Comput Environ Urban Syst 
26(2-3):141-162 

Bertin J (1967) Sémiologie graphique. Les diagrammes, les réseaux et 
les cartes, Mouton, Gauthier-Villars, Paris, p 431 

Brewer C (2021) ColorBrewer Tool. http://www.ColorBrewer.org. 
Accessed 8 July 2021 

Brooks CEP, Carruthers N (1953) Handbook of statistical methods in 
meteorology. London, p 412 

Camacho Olmedo MT, Paegelow M, Mas J-F, Escobar F (eds) (2018) 
Geomatic approaches for modeling land change scenarios. Springer 
International Publishing, p 525 

Cauvin C, Escobar F, Serradj A (2010a) Thematic cartography and 
transformations. Wiley, London, p 463 

Cauvin C, Escobar F, Serradj A (2010b) Cartography and the impact of 
the quantitative revolution. Wiley, London, p 408 

Cauvin C, Escobar F, Serradj A (2010c) New approaches in thematic 
cartography. Wiley, London, p 291 

Cebrecos A, Domínguez-Berjón MF, Duque I, Franco M, Escobar F 
(2018) Geographic and statistic stability of deprivation aggregated 
measures at different spatial units in health research. Appl Geogr 
9:9-18 

Di Gregorio A, Jansen L (2000) Land Cover Classification System 
(LCCS): Classification concepts and user manual, FAO, 
http://www.fao.org/3/x0596e/X0596e00.htm#P-1_0. Accessed 1 July 2021 

Eagleson S, Escobar F, Williamson I (2002) Hierarchical Spatial 
Reasoning theory and GIS technology applied to the automated 
delineation of administrative boundaries. Comput Environ Urban 
Syst 26(2002):185-200 

Eagleson S, Escobar F, Williamson I (2003) Automating the admin- 
istration boundary design process using hierarchical spatial reason- 
ing theory and geographical information systems. Int J Geogr Inf 
Sci 17(2):99-118 

European Environment Agency (2017) Technical specifications for the 
CORINE Land Cover (CLC) pilot projects implemented in the 
Eastern Partnership countries (2017-2019), Implementation of the 
Shared Environmental Information System principles and practices 
in the Eastern Partnership countries (ENI SEIS II East), p 15 

García-Álvarez D (2018) Aproximación al estudio de la incertidumbre 
en la modelización del Cambio de Usos y Coberturas del Suelo 
(LUCC). PhD Dissertation. Universidad de Granada, p 445 




García-Álvarez D, Camacho Olmedo MT, Paegelow M (2019) 
Sensitivity of a common Land Use Cover Change (LUCC) model 
to the Minimum Mapping Unit (MMU) and Minimum Mapping 
Width (MMW) of input maps. Comput Environ Urban Syst 
78:101389 

Guzman LA, Escobar F, Peña J, Cardona R (2020) A cellular 
automata-based land-use model as an integrated spatial decision 
support system for urban planning in developing cities: the case of 
the Bogotá region. Land Use Policy 92:1-13 

Hall ET (1971) La dimension cachée, Seuil, Paris, 1st edn, p 256, 1966, 
2nd edn, p 254 

Hewitt R, van Delden H, Escobar F (2014) Participatory land use 
modelling, pathways to an integrated approach. Environ Model 
Softw 52:149-165 

Huntsberger DV (1961) Elements of statistical inference. Allyn & 
Bacon, Boston, p 528 

Koláčný A (1977) Cartographic information. A fundamental concept 
and term in modern cartography, Cartographica, monograph no 19, 
pp 39-45 

Kull CA (2005) Historical landscape photography as a tool for land use 
change research. Norsk Geografisk Tidsskrift-Norwegian J Geogr 
59:253-268 

Monmonier MS (1982) Flat laxity optimisation and rounding in the 
selection of class intervals. Cartographica 19(1):16-27 

Monmonier MS (1993) Mapping it out. Expository cartography for the 
humanities and social science. University of Chicago Press, 
Chicago, p 301 

Morrison JL (1976) The science of cartography and its essential 
processes. Int Yearbook Cartogr 16:85-97 

Muller JC (1975) Association in choropleth map comparison. Annals 
Ass Am Geogr (AAAG) 65(3):403-413 

NOAA (2011) Digital Coast GeoZone. Tech talk for the Digital Coast. 
https://geozoneblog.wordpress.com/2011/10/28/resolution-vs-minimum-mapping-unit-size-does-matter/. 
Accessed 1 July 2021 

Olson JM (1975) Experience and the improvement of cartographic 
communication. Cartogr J 12:94-108 

Openshaw S, Taylor PJ (1979) A million or so correlation coefficients: 
three experiments on the modifiable areal unit problem. In: 
Wrigley N (ed) Statistical methods in the spatial sciences. Pion, 
London, pp 127-144 

Paar P (2006) Landscape visualizations: applications and requirements 
of 3D visualization software for environmental planning. Comput 
Environ Urban Syst 30(6):815-839 

Pontius RG, Malizia NR (2004) Effect of category aggregation on map 
comparison. In: Egenhofer MJ, Freksa C, Miller HJ (eds) Third 
international conference of the geographic information science 
(GIScience). Springer, Adelphi, pp 251-268 

Rajabifard A, Escobar F, Williamson I (2000) Hierarchical spatial 
reasoning applied to spatial data infrastructures. Cartography 29 
(2):41-50 

Raposo P, Brewer C, Sparks K (2016) An impressionistic cartographic 
solution for base map land cover with coarse pixel data. Cartogr 
Perspect. https://doi.org/10.14714/CP83.1351 

Ratajski L (1978) The main characteristics of cartographic communi- 
cation as a part of theoretical cartography. Int Yearbook Cartogr 
18:21-32 

Robinson AH (1952) The look of maps: an examination of cartographic 
design. University of Wisconsin, Madison, US, p 105. Reprinted in 
The American Cartographer, 13:3, 280, (1986) 

Robinson AH (1953) Elements of cartography, 1st edn. John Wiley & 
Sons, New York, p 245 

Robinson AH, Sale R, Morrison J, Muehrcke PC (1984) Elements of 
cartography, 5th edn. John Wiley & Sons, New York, p 541 

Robinson AH (1969) Elements of cartography, 2nd edn. John Wiley & 
Sons, New York, p 333 




Robinson GM (1998) Methods & techniques in human geography. 
John Wiley & Sons, England, p 390 

Slocum TA, Mcmaster RB, Kessler FC, Howard HH (2005) Thematic 
cartography and geography visualization. Pearson Prentice Hall, 
p 518 

van Lammeren R, Houtkamp J, Colijn S, Hilferink M, Bouwman A 
(2010) Affective appraisal of 3D land use visualization. Comput 
Environ Urban Syst 34:465-475 


White T (2017) Symbolization and the visual variables. The geographic 
information science & technology body of knowledge, 2nd Quarter. 
In: Wilson JP (ed) https://doi.org/10.22224/gistbok/2017.2.3 

White R, Engelen G (1993) Cellular dynamics and GIS: Modelling 
spatial complexity. Geograph Syst 1:237-253 

Wu J, Harbin L (2006) Concepts of Scale and Scaling. In: Wu J, 
Jones KB, Li H, Loucks OL (eds) Scaling and uncertainty analysis 
in ecology: methods and applications. Springer, Dordrecht, The 
Netherlands, pp 3-15 


Open Access This chapter is licensed under the terms of the Creative 
Commons Attribution 4.0 International License 
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, 
adaptation, distribution and reproduction in any medium or format, as 
long as you give appropriate credit to the original author(s) and the 
source, provide a link to the Creative Commons license and indicate if 
changes were made. 

The images or other third party material in this chapter are included in 
the chapter's Creative Commons license, unless indicated otherwise in 
a credit line to the material. If material is not included in the chapter's 
Creative Commons license and your intended use is not permitted by 
statutory regulation or exceeds the permitted use, you will need to 
obtain permission directly from the copyright holder. 


Miguel Ángel Castillo-Santiago, Edith Mondragón-Vázquez, 
and Roberto Domínguez-Vera 


Abstract 


We present an approach that is widely used in the field of 
remote sensing for the validation of single LUC maps. 
Unlike other chapters in this book, where maps are 
validated by comparison with other maps with better 
resolution and/or quality, this approach requires a ground 
sample dataset, i.e. a set of sites where LUC can be 
observed in the field or interpreted from high-resolution 
imagery. Map error is assessed using techniques based on 
statistical sampling. In general terms, in this approach, the 
accuracy of single LUC maps is assessed by comparing the 
thematic map against the reference data and measuring the 
agreement between the two. When assessing thematic 
accuracy, three stages can be identified: the design of the 
sample, the design of the response, and the estimation and 
analysis protocols. Sample design refers to the protocols 
used to define the characteristics of the sampling sites, 
including sample size and distribution, which can be 
random or systematic. Response design involves estab- 
lishing the characteristics of the reference data, such as the 
size of the spatial assessment units, the sources from which 
the reference data will be obtained, and the criteria for 
assigning labels to spatial units. Finally, the estimation and 
analysis protocols include the procedures applied to the 
reference data to calculate accuracy indices, such as user's 
and producer's accuracy, the estimated areas covered by 
each category and their respective confidence intervals. 
This chapter has two sections in which we present a couple 
of exercises relating to sampling and response design; the 
sample size will be calculated, the distribution of sampling 
sites will be obtained using a stratified random scheme, and 
finally, a set of reference data will be obtained by 
photointerpretation at the sampling sites (spatial units). 
The accuracy statistics will be calculated later in Sect. 5 
in chapter “Metrics Based on a Cross-Tabulation Matrix 
to Validate Land Use Cover Maps” as part of the 
cross-tabulation exercises. The exercises in this chapter 
use fine-scale LUC maps obtained for the municipality of 
Marqués de Comillas in Chiapas, Mexico. 


Keywords 


Single map validation · Sample size · Sampling design · 
Systematic sampling · Random sampling · Reference 
data 


1 Sample Size Estimation and Spatial 
Distribution of Sampling Sites 
in a Stratified Randomised Design 


When conducting error assessment, it is important to strike a 
balance between the theoretical requirements and the prac- 
tical reality of implementation (Congalton 1991). In the map 
validation process, it is therefore crucial to have the right 
sample size and to use the right number of spatial units in 
order to ensure that reliable accuracy indices can be obtained 
without incurring high costs. 

There is no single right way to calculate the ideal sample 
size; in general, this task could be regarded as a process of 
successive approximations, in which criteria such as the 
availability of resources, levels of sampling error, or the 
desired degree of accuracy all play an important role. The 
expertise of the user and his/her interest in certain thematic 
classes are also important factors in the success of the esti- 
mation process. 

An initial estimation of the most appropriate sample size 
can be made with the formulae used in statistical sampling. 
Equations for the validation of thematic maps have often 
been taken from the original work by Cochran (1977) and 
for a simple, stratified randomised design, Stehman and 
Foody (2019) propose the following formula: 

$$n = \frac{z^{2}\, O\,(1 - O)}{d^{2}}$$

where O = accuracy expressed as a proportion (in the case 
of simple random sampling O is the anticipated overall 
accuracy, whilst in stratified sampling it is the anticipated 
user's accuracy); n = number of sampling sites; z = 
percentile from the standard normal distribution (z = 1.96 for a 
95% confidence interval); and d = desired half-width of the 
confidence interval of O. The half-width can also be expressed as 
$d = z \cdot S(O)$, where S(O) is the standard error of the estimated accuracy. 

In the case of stratified random sampling, Olofsson et al. 
(2014) recommend the following formula: 

$$n = \left( \frac{\sum_{i} W_i S_i}{S(O)} \right)^{2}$$

where S(O) = standard error of estimated overall accuracy; 
$W_i$ = mapped proportion of the area of class i; $S_i$ = standard 
deviation of class i, $S_i = \sqrt{U_i (1 - U_i)}$; and $U_i$ = user's 
accuracy for class i. 

Note that in both cases, it is necessary for the user to 
define certain parameters in advance, such as the permissible 
level of error (S(O)) or the user's accuracy values. These 
data should be obtained from prior or approximate knowl- 
edge regarding the quality of the map or from previous 
experience in producing maps with similar characteristics. 
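
As a worked illustration, both sample size formulae can be evaluated with a few lines of Python. This is a minimal sketch: the function names are ours, the class proportions and user's accuracies in the example are hypothetical, and rounding up to the next whole site is an assumption.

```python
import math


def sample_size_simple(accuracy, half_width, z=1.96):
    """n = z^2 * O * (1 - O) / d^2, rounded up to a whole site."""
    return math.ceil(z ** 2 * accuracy * (1 - accuracy) / half_width ** 2)


def sample_size_stratified(weights, user_accuracies, se_overall):
    """n = (sum(W_i * S_i) / S(O))^2 with S_i = sqrt(U_i * (1 - U_i))."""
    weighted_sd = sum(
        w * math.sqrt(u * (1 - u)) for w, u in zip(weights, user_accuracies)
    )
    return math.ceil((weighted_sd / se_overall) ** 2)


if __name__ == "__main__":
    # Anticipated accuracy of 0.85 and a half-width of 0.05.
    print(sample_size_simple(0.85, 0.05))
    # Three hypothetical classes covering 60, 30 and 10% of the map.
    print(sample_size_stratified([0.6, 0.3, 0.1], [0.9, 0.8, 0.7], 0.01))
```

Halving the half-width d, or the standard error S(O), multiplies the required sample size by four, which is why these targets need to be balanced against the available resources.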

Sometimes it may be difficult to estimate user's accuracy, 
so practical recommendations for sample size calculation 
may be useful. Hay (1979) proposed allocating 50 sampling 
sites per thematic class. Congalton (1988, 2016), based on a 
series of Monte Carlo simulations, also recommended allo- 
cating 50 sampling sites per thematic class but only when 
map extent was under 500,000 ha and there were 12 or 
fewer thematic classes. In more complex situations, i.e. 
when the map extent was over 500,000 ha or it had more 
than 12 classes, he proposed allocating 75-100 samples per 
thematic class. According to his approach, therefore, total 
sample size is dependent on the size of the thematic map and 
the number of classes it contains. 
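
The rule of thumb described above can be written as a small helper. This is only a sketch: the function name is ours, and treating a map of exactly 500,000 ha as the complex case is an assumption, since the text does not cover that boundary.

```python
def sites_per_class_rule_of_thumb(map_extent_ha, n_classes):
    """Return the (minimum, maximum) recommended sampling sites per class."""
    if map_extent_ha < 500_000 and n_classes <= 12:
        return (50, 50)
    # Larger maps or more than 12 classes: 75-100 sites per class.
    return (75, 100)


def total_sample_range(map_extent_ha, n_classes):
    """Total sample size range implied by the rule of thumb."""
    low, high = sites_per_class_rule_of_thumb(map_extent_ha, n_classes)
    return (low * n_classes, high * n_classes)


if __name__ == "__main__":
    # Hypothetical map of 200,000 ha with 8 thematic classes.
    print(total_sample_range(200_000, 8))
```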

Sampling design is another important factor to consider. 
Frequently used types include systematic, simple random 
and stratified random sampling. Traditionally, the cost or 
ease of fieldwork was a criterion for preferring some designs 
over others. With the increased availability of 
high-resolution imagery, in many cases, it is no longer 
essential to obtain data directly in the field. Reference data 
can now be interpreted from the imagery, so reducing costs 
dramatically. 

Systematic sampling is easier to implement in the 
field, but has the disadvantage that it cannot be used to 
construct an unbiased variance estimator (Stehman 2009). 
Randomised designs can be more effective at estimating 
accuracy parameters (Congalton 1988) and can adapt more 
easily to changes in sample size without losing their prob- 
ability sampling character (Stehman 2009). In the case of 
stratified random sampling, once the sample size has been 
calculated, rules must be established to allocate the sampling 
sites to each of the strata or thematic classes. These rules 
normally apply one of the following criteria: an equal 
number of sites in each class, a number proportional to the 
size of the class, or a number that depends on both the size of 
the class and the expected user’s accuracy for this class 
(optimal allocation). The allocation criterion affects the 
precision of some of the accuracy parameters. For example, 
with optimal allocation, the variance of the overall accuracy 
and the user’s accuracy for rare classes decreases. By con- 
trast, with equal distribution, more precise estimates of the 
user’s accuracy for rare classes can be obtained, whilst in 
large classes, the precision decreases (Stehman 2012). 
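
The three allocation criteria can be compared with the following sketch. The function name and the example figures are hypothetical. The optimal rule is written here as an allocation proportional to W_i * S_i, which is our reading of "a number that depends on both the size of the class and the expected user's accuracy". Because each class is rounded independently, the allocated sites may not add up exactly to n.

```python
import math


def allocate_sites(n, weights, user_accuracies=None, rule="proportional"):
    """Distribute n sampling sites among strata (thematic classes)."""
    if rule == "equal":
        shares = [1 / len(weights)] * len(weights)
    elif rule == "proportional":
        shares = list(weights)
    elif rule == "optimal":
        # Share proportional to W_i * S_i, with S_i = sqrt(U_i * (1 - U_i)).
        products = [
            w * math.sqrt(u * (1 - u)) for w, u in zip(weights, user_accuracies)
        ]
        shares = [p / sum(products) for p in products]
    else:
        raise ValueError(f"unknown allocation rule: {rule}")
    return [round(n * share) for share in shares]


if __name__ == "__main__":
    weights = [0.6, 0.3, 0.1]   # hypothetical mapped proportions
    accuracies = [0.9, 0.8, 0.7]  # hypothetical expected user's accuracies
    for rule in ("equal", "proportional", "optimal"):
        print(rule, allocate_sites(300, weights, accuracies, rule))
```

In this example the rarest class receives 100 sites under equal allocation but only 30 under proportional allocation, which shows how strongly the criterion affects the precision attainable for small classes.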

Regardless of the design chosen, a problem that sometimes 
arises is the under-representation of small or rare thematic 
classes in the sample. In other words, once the sample has 
been calculated and distributed, it may leave some classes 
with too few sites (<50). Some authors (Olofsson et al. 2014; 
Finegold et al. 2016) suggested a two-step solution for the 
specific case of stratified random sampling. First, calculate 
and distribute the sample according to the proportional or 
optimal criteria, and if any class turns out to have a small 
number of sampling sites (<50), then allocate 50 sites to it and 
recalculate the total sample size. 
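
The second step of this solution amounts to raising every small class to the minimum and adding up again. A minimal sketch, with an illustrative function name and a hypothetical initial allocation:

```python
def enforce_minimum_sites(allocation, minimum=50):
    """Raise classes below the minimum and recompute the total sample size."""
    adjusted = [max(sites, minimum) for sites in allocation]
    return adjusted, sum(adjusted)


if __name__ == "__main__":
    # Hypothetical allocation in which the third class has only 40 sites.
    adjusted, total = enforce_minimum_sites([156, 104, 40])
    print(adjusted, total)
```

Note that the total grows with the adjustment, since the sites of the other classes are left unchanged.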

Once all the different stages of the accuracy assessment 
process have been performed, the precision values obtained 
should be reviewed, e.g. the magnitude of the overall 
accuracy standard error or the width of the user's accuracy 
confidence intervals. Even if there are some variations from 
the expected values, if the values obtained meet the analyst's 
targets as regards accuracy, then there would be no need to 
repeat the analysis (Stehman and Foody 2019). If not, it 
would be necessary to try again, adding new sites to increase 
the sample size. 


Sample Data for Thematic Accuracy Assessment in QGIS 


QGIS Exercise: To calculate sample size and to 
distribute sample sites using a random stratified 
approach 


Next, we present a practical way to carry out the sampling 
design for obtaining reference data. In this exercise, we 
will estimate the sample size for a stratified random design, 
for which we will have to specify the expected standard 
error of the overall accuracy and to provide an a priori 
estimate of user's accuracy values. Sometimes, these fig- 
ures may be difficult to provide in which case we can use 
the default values provided by the tool we will be using. 


Available tools 


* MapAccurAssess plugin 
* Semi-Automatic Classification Plugin (SCP) 
* AcATaMa plugin 


There are several useful tools in QGIS for statistical sam- 
pling design. All of them are external plugins such as 
Semi-Automatic Classification Plugin (SCP), AcATaMa and 
MapAccurAssess. The MapAccurAssess plugin is a trial 
version specifically developed in the context of this book, 
which is not yet available in the official QGIS repositories. 

SCP, which was developed by Congedo (2020), is a 
toolset for the classification and validation of land-cover and 
land use maps. With this plugin, the sample size and the 
allocation to each class must be calculated externally using a 
spreadsheet or other software. Once the number of sites per 
class has been defined, the plugin allows for a random dis- 
tribution per thematic class. The size of the spatial units is 
indicated in number of pixels. Both the map and the samples 
must be in raster format. 

AcATaMa was developed by the Group from the Forest 
and Carbon Monitoring System for the validation of single 
LUC maps (Llano 2019). It consists of a set of tools that 
guide the user through a series of steps: (a) sampling design 
(stratified or simple); (b) sample classification; and (c) cal- 
culation of the confusion matrix and accuracy statistics. In 
the sample classification step, the spatial unit is a pixel (or 
points in the GeoPackage or shapefile format), which is not 
very convenient for those who prefer to use a different 
spatial unit, such as a group of pixels or polygons. At that 
stage (classification), a set of tools is enabled to zoom in on 
each of the samples, and four windows are created to display 
images of interest. An editable attribute table is also created 
to classify the samples. 

In this exercise, we will be using MapAccurAssess, a
plugin developed by the authors of this chapter, which
includes several of the suggestions proposed by Olofsson
et al. (2013) and Finegold et al. (2016). It is available at
https://doi.org/10.5281/zenodo.5419130 with its associated
documentation. For more information on the plugin, readers
are referred to Chapter “About This Book”.

This plugin provides several functions for calculating 
sample size in a stratified random design, using Neyman's 
optimal allocation to calculate the number of sampling sites 
in each thematic class. If, after that, any class has fewer than
50 sampling sites, it must be assigned between 50 and 100 
sites depending on the complexity of the map. The result is a 
layer of points (shapefile) that are distributed over all the 
thematic categories of the map according to the stratified 
random design criteria. The points can be further modified to 
represent a polygon using QGIS functions. 
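The allocation logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the plugin's code: the function name, the rounding rule and the example classes are hypothetical, and the per-class standard deviation is taken as sqrt(Ui(1 - Ui)), the usual expression for a correct/incorrect variable.

```python
import math

def neyman_allocation(areas, ui, n_total, min_sites=50):
    """Allocate n_total sites to classes (Neyman), then raise the small classes.

    areas: dict class -> mapped area; ui: dict class -> a priori user's accuracy.
    Returns two dicts: the Neyman allocation and the adjusted allocation.
    """
    # Standard deviation of a 0/1 (incorrect/correct) variable for each class.
    sd = {c: math.sqrt(ui[c] * (1.0 - ui[c])) for c in areas}
    # Neyman weights are proportional to class area times class standard deviation.
    weight = {c: areas[c] * sd[c] for c in areas}
    total_weight = sum(weight.values())
    neyman = {c: round(n_total * weight[c] / total_weight) for c in areas}
    # Any class below the minimum is raised to it (assumed rule, 50 by default).
    adjusted = {c: max(n, min_sites) for c, n in neyman.items()}
    return neyman, adjusted

# Hypothetical map with three classes (areas in ha) and equal a priori accuracies.
areas = {"forest": 7000.0, "pasture": 2500.0, "water": 500.0}
ui = {"forest": 0.6, "pasture": 0.6, "water": 0.6}
neyman, adjusted = neyman_allocation(areas, ui, n_total=600)
print(neyman)
print(adjusted)
```

When all Ui values are equal, as in this example, the Neyman allocation reduces to an allocation proportional to class area, which is why very small classes need the minimum-size adjustment.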


Materials 

Marqués de Comillas Land Use Cover Map 2019


Requisites


To calculate the area of each thematic class, the LUC map
must be in a projected coordinate system (not geographic
coordinates). The plugin has been tested on map
projections with distance units in metres (rather than feet, for
example).


Execution 

Step 1 

Install the MapAccurAssess plugin. All the relevant infor- 
mation regarding the installation of the plugin can be found 
in Chapter “About This Book" and the plugin's manual, 
which is included in the plugin's download. 

Step 2 

Go to the Plugins Menu, select the Accuracy Assessment
and Random point options. Alternatively, you can click on
the Random Point icon (Fig. 1).


Fig. 1 Exercise 1. Step 2. MapAccurAssess plugin icon


Fig. 2 Exercise 1. Step 3. [Dialogue box. Input parameters shown:
thematic map (raster file, .tif),
C:/metrics-from-cross-tabulation-matrix/LandCoverMarques2019.tif;
minimum separation between centres of sampling sites (m), 100;
one Ui value per class (classes 50, 51, 130, 161, 261, ...)]


Step 3 


In the dialogue box in Fig. 2, fill in the map filename (LUC 
map of Marqués de Comillas) and modify or accept the sug- 
gested values. A minimum distance between the centres of the 
spatial units must be specified in order to prevent overlapping, 
e.g. if the spatial units are square polygons of one ha, the 
minimum distance between their centres must be 100 m. 

The Ui values (user's accuracy for class i) are a priori
estimates of the accuracy of each thematic class, which
can be based on expert judgement or on previous assessments.
If there are any doubts about these values, the default
values can be retained. Ui can vary between 0 and 1; a
value of 0.5 produces the largest number of sites, and values
of over 0.5 generate a smaller sample size. The last stage
is to select the folder where the results will be saved.
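The effect of Ui on the sample size can be illustrated with a short sketch. We assume here the simplified formula for stratified sampling, n = (sum of Wi x Si / S(O))^2, where Wi is the area proportion of class i, Si = sqrt(Ui(1 - Ui)) and S(O) is the target standard error of the overall accuracy. The plugin may implement the calculation differently, and the weights and target error below are hypothetical.

```python
import math

def sample_size(weights, ui, target_se):
    """Approximate total sample size for a stratified random design.

    weights: area proportion of each class (summing to 1);
    ui: a priori user's accuracy of each class;
    target_se: desired standard error of the overall accuracy.
    """
    weighted_sd = sum(w * math.sqrt(u * (1.0 - u)) for w, u in zip(weights, ui))
    return math.ceil((weighted_sd / target_se) ** 2)

weights = [0.7, 0.25, 0.05]   # hypothetical class area proportions
for u in (0.5, 0.6, 0.8, 0.9):
    # The same Ui is used for every class to isolate its effect.
    print(u, sample_size(weights, [u] * 3, target_se=0.02))
```

Because Ui(1 - Ui) is largest at Ui = 0.5, this is the most conservative choice when no prior information is available.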


Results and Comments 


The results are displayed in the Record tab, and two types of
output are generated and saved in the selected folder: (i) a
.csv file with statistics about the thematic classes and (ii) a
point shapefile where the points represent the centres of the
sampling sites.


[Fig. 2, help text shown in the dialogue box. Minimum separation
between centres of sampling sites (m): minimum separation distance
for each site. Minimum number of sampling sites in small classes
(50 to 100): in order to be able to properly compute the accuracy
statistics, a thematic class must have at least 50 sampling sites;
if the map has more than 12 classes or an extent of more than
500,000 ha, use values ranging from 50 to 100. Ui: estimated
accuracy of class i, values in [0, 1].]


The .csv file contains a row for each thematic class and
four columns showing the id, the area, the number of sampling
sites estimated using Neyman's optimal allocation and the
suggested number of sampling sites, adjusted to ensure that
no class has fewer than 50 sites (Table 1). If the area covered
by a particular class is so small that 50 sites cannot be placed
on it, the adjusted value will also be less than 50. Classes
like this should be merged into other similar classes.
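A quick way to spot the classes that should be merged is to read the .csv file and list those whose adjusted value is still below 50. The sketch below uses Python's standard csv module. The column names and the three rows are hypothetical and may not match the file actually written by the plugin.

```python
import csv
import io

# Hypothetical file content; the real column names in the plugin's .csv may differ.
CSV_TEXT = """id,area,samples_neyman,samples_adjusted
10,12000,310,310
20,900,23,50
30,4,0,12
"""

def classes_to_merge(csv_file, minimum=50):
    """Return the ids of classes whose adjusted sample size is below the minimum."""
    reader = csv.DictReader(csv_file)
    return [row["id"] for row in reader if int(row["samples_adjusted"]) < minimum]

# With a real file, replace io.StringIO(CSV_TEXT) with open(path, newline="").
print(classes_to_merge(io.StringIO(CSV_TEXT)))
```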

The vector point layer contains the spatial location of the 
centres of randomly distributed sampling sites (Fig. 3). Each 
point has two attributes, a unique identifier (id) and the 
thematic class value recovered from the LUC map. 


2 Collection of Reference Data for Assessing 
the Accuracy of a Thematic Map 


One of the major challenges of map evaluation is to obtain a 
reliable reference dataset with minimal positional errors and 
with the same date as the LUC maps. The aim is to obtain a 
data subset that faithfully represents the population from 
which it was extracted, so as to obtain confident accuracy 
estimates (Stehman 2009). 


Sample Data for Thematic Accuracy Assessment in QGIS 


Table 1 Results from Exercise 1. Number of sampling sites per
thematic class

Classes   Area (ha)   samples_neyman   samples_adjusted
50           26,009               77                 77
51           32,875               98                 98
130             252                0                 50
161           6,943               20                 50
261          13,504               40                 50
290         116,429              347                347
301           2,357                7                 50
420           2,021                6                 50

Fig. 3 Result from Exercise 1. Map showing the spatial distribution of sampling sites and the corresponding table of attributes 


The collection of reference data requires the prior defi- 
nition of several aspects relating to the size of the sampling 
area and the characteristics of the information we want to 
obtain (Olofsson et al. 2014): (a) characteristics of the spatial 
assessment units; (b) sources of reference data; (c) labelling 
protocol; and finally (d) classification agreements. Spatial 
assessment units refer to the sampling areas where the ref- 
erence and map values are compared. Traditionally, the 
chosen spatial unit was a pixel or a polygon, or even a group 
of pixels, although there is still no consensus regarding the 
best size (Stehman and Czaplewski 1998; Olofsson et al. 
2014; Stehman and Wickham 2011). What is certain is that
various factors must be taken into account. For example,
when a pixel is used as the spatial unit, it must be decided 
whether the land-cover label to be assigned will be exclu- 
sively what is observed on each individual pixel or whether 
the surrounding context will be taken into account, so as to 
reduce possible georeferencing errors. If we use a polygon or 
group of pixels, it will be necessary to define their size, for 
example, one hectare or blocks of three by three pixels. The 
advantage of using an area larger than one single pixel is that 
the incorrect assignment of labels due to georeferencing 
errors is minimised. The major drawback is that each spatial 
unit can contain several different land-cover classes, which 
means that rules must be drawn up to assign the land-cover
to the right class (Stehman and Wickham 2011). The mini- 
mum mapping unit of the map must also be taken into 
account, given that the spatial unit must not be smaller than 
the minimum mapping unit. In the end, each user will have 
to opt for the spatial unit size that best suits his or her 
purposes. 

The reference source can be either observed field data or 
data interpreted from satellite imagery and aerial pho- 
tographs. Although data collected in the field is always 
preferable, this method is much more expensive, and the 
interpretation of aerial photographs and satellite images is 
often regarded as an acceptable alternative. In this case, it is 
important to ensure that the reference data has a higher 
quality and resolution than the images used in the initial 
mapping process. The labelling protocol should be the same 
as that used in the mapping, i.e. the land-cover classes or 
types of change, and the photointerpretation criteria for 
labelling the sampling sites should be the same as those used 
when drawing the map being assessed. 

When the reference data are obtained from satellite ima- 
gery, there is a degree of uncertainty associated with the 
level of expertise of the photointerpreter. This uncertainty 
can be reduced if classification criteria are established before 
obtaining the reference data. To minimize interpreter bias, 
we suggest that at least two specialists perform the class 
assignment independently. When different labels are 
assigned to the same sampling unit, a third interpreter must 
decide which class it should be assigned to. 
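The double-interpretation protocol can be expressed as a small rule. The sketch below is only illustrative: the function name and the example labels are hypothetical, and it simply encodes the procedure suggested above (two independent labels, with a third interpreter deciding in case of disagreement).

```python
def resolve_label(first, second, third=None):
    """Return the agreed reference label for one sampling unit.

    first, second: labels assigned independently by two interpreters.
    third: label chosen by a third interpreter, needed only on disagreement.
    """
    if first == second:
        return first
    if third is None:
        raise ValueError("interpreters disagree; a third interpretation is required")
    return third

# Hypothetical labels: sampling unit id -> class code, one dict per interpreter.
labels_a = {1: 290, 2: 261, 3: 50}
labels_b = {1: 290, 2: 290, 3: 50}

# Units that must be passed on to the third interpreter.
disputed = [uid for uid in labels_a if labels_a[uid] != labels_b[uid]]
print(disputed)
```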

It is also necessary to establish the criteria for dealing 
with non-ideal situations. When the spatial reference unit, 
defined as a set of pixels, contains several different 
land-cover classes we suggest, when possible, assigning the 
reference unit to the majority category, representing more 
than 50% of the area. Another complex situation could be
when the reference unit contains a linear feature or corridor 
which is assigned to several different land-cover classes. In 
this case, we suggest moving the sampling site to another 
place in which there is less uncertainty regarding class 
allocation. The producer and the person(s) assessing the map 
should always reach agreement on such decisions and doc- 
ument them, so as to avoid biases in the accuracy 
assessment. 
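The majority rule suggested above for mixed spatial units can be written down explicitly, which helps to document the decision. The following sketch assumes that the interpreter has estimated the share of the unit covered by each class. The function name and the example values are hypothetical.

```python
def majority_class(area_by_class):
    """Return the class covering more than 50% of the unit, or None if none does."""
    total = sum(area_by_class.values())
    for cls, area in area_by_class.items():
        if area > 0.5 * total:
            return cls
    return None

# Hypothetical units: class code -> share of the unit covered by that class.
print(majority_class({290: 0.62, 261: 0.38}))
print(majority_class({290: 0.40, 261: 0.35, 50: 0.25}))
```

A result of None flags a unit without a majority class, for which a separate, previously agreed rule is needed.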


QGIS Exercise: To collect reference data 


This exercise is a guide to collecting reference data. Instead of 
fieldwork, high-resolution satellite imagery, available on Goo- 
gle Earth, is used together with various QGIS tools. The result 
of this exercise is a set of comparisons of land-cover 


M. Á. Castillo-Santiago et al. 


observations taken from the high-resolution image (reference
data) and the land-covers extracted from the LUC map under 
evaluation. The output data are formatted to compute the error 
matrix and accuracy statistics. These calculations are explained 
in Part III of this book (Sect. 5 in chapter “Metrics Based on a 
Cross-Tabulation Matrix to Validate Land Use Cover Maps"). 


Available tools 


* SCP plugin 
* AcATaMa plugin 
* Vector menu
    Geoprocessing tools
      Buffer
* QuickMapServices plugin 
* Google Earth Engine Data Catalog plugin 


The Semi-Automatic Classification Plugin (SCP) and the 
AcATaMa plugin have a module for the collection of ref- 
erence data. AcATaMa provides a multi-view interface that 
allows spatial units to be added and revised in an orderly 
manner. However, spatial units can only be one pixel in size. 
For its part, the process for collecting the reference data 
using SCP is very similar to the process that would be fol- 
lowed if just QGIS tools were used. Notwithstanding, as 
SCP uses a unique data format (.scp), it is quite complicated 
to add other types of data or to use information from other 
platforms. 

Both plugins have valuable tools that assist in the capture
of reference data. However, as we intend to use spatial
units larger than one pixel and wish to keep the installation of new
interfaces and formats to a minimum, we will only use the
basic QGIS tools (Buffer) and other data services such as the
Google Earth Engine Data Catalog and QuickMapServices.


Materials 


Centroids of sample sites—Marqués de Comillas (the point 
vector layer RandomSample.shp created in the previous
exercise, which contains the centres of the sample sites)


Execution 
Step 1 


Before data collection can begin, the size and shape of the 
spatial unit must be established, i.e. the area over which we 
will be making the comparison between the thematic map 
values and the reference values. The minimum mappable 
area of the thematic map to be used in this exercise is 1 ha, 
and it is generally recommended that the spatial unit should 
be of a similar size. Accordingly, in this exercise, we will be 
using square polygons of 1 ha as the spatial unit. 




The point layer containing the centroids (Centroids of 
sample sites—Marqués de Comillas) will be used to create 
the spatial assessment units. To form square polygons
centred on each of the points in the point layer, first use the
Buffer tool in the Geoprocessing Tools menu. The input will
be the point layer, and the distance value depends on the
desired size of the square: in this case, 50 m, i.e. half the
side of a 1 ha square. Change the End cap style to “Square”
and leave the rest of the parameters unchanged (Fig. 4).
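The buffer distance follows directly from the desired area of the square. The short sketch below makes the arithmetic explicit. It assumes a projected coordinate system in metres and an axis-aligned square. The function name and the centre coordinates are hypothetical.

```python
import math

def square_unit(x, y, area_ha=1.0):
    """Return (buffer_distance_m, corners) for a square unit centred on (x, y)."""
    side = math.sqrt(area_ha * 10_000.0)   # 1 ha = 10,000 square metres
    half = side / 2.0                      # value to enter as the buffer distance
    corners = [(x - half, y - half), (x + half, y - half),
               (x + half, y + half), (x - half, y + half)]
    return half, corners

# Hypothetical centre of a sampling site, in metres.
distance, corners = square_unit(755568.0, 1781663.0)
print(distance)
print(corners[0], corners[2])
```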

The newly created layer will have two attributes: the id 
and the value of the thematic class (inherited from the pre- 
vious exercise). To avoid bias in the photointerpretation 
decision-making process, we advise hiding the class column 
(the value of the thematic class taken from the LUC map). 
To hide a column in the attribute table without deleting it 
definitively, right-click on the area of the attribute table 
headers, select the option Organize Columns, and then select 
the columns to hide (Fig. 5). 


Fig. 4 Exercise 2. Step 1. Buffer


Step 2


Ideally, to identify the land-cover type of each sampling unit, it 
would be necessary to overlay them on high spatial resolution 
images with the same (or similar) date as the images used in the 
mapping. If such data are available, photointerpretation of the 
spatial units can proceed directly. However, acquiring
high-resolution images to verify extensive areas can be
expensive, and limited resources often restrict the use of
this source of imagery.

In the following steps, we propose a partial solution to 
this problem based on the combined use of image servers 
(Google Earth, Bing, ESRI) with high spatial and temporal 
resolution. However, in these servers it is impossible to 
identify and select scenes according to their acquisition 
dates. One way to estimate the dates of these data is to 
compare them with images with higher temporal resolution, 
which have a known acquisition time, and for which a longer 




Fig. 5 Exercise 2. Step 1. How to hide columns in the attribute table 


historical record is available, such as Sentinel or Landsat 
images. For this purpose, we will install two plugins with 
which we can access the high spatial resolution image ser- 
vers of Google, Bing and ESRI (QuickMapServices) and 
Sentinel, Landsat, Aster and other images (Google Earth 
Engine Data Catalog). To see how to install these plugins in 
QGIS, see Chapter “About This Book”. 


Step 3 


Once QuickMapServices is installed, open the plugin and
select Settings. Then select the More Services tab and click on
Get contributed pack. To add images with high spatial
resolution to the QGIS project, go to the Web option in the
main menu, select QuickMapServices, then Google and finally
Google Satellite. After selecting these options, the Google
Satellite images become available in the Layers panel.


Step 4 


Add Sentinel-2 images from the Google Earth Engine Data 
Catalog plugin. The plugin requires the product type, dates
and cloud cover percentage to be defined (Fig. 6). The images
can be saved temporarily or in permanent files. The images 
to be added should be dated as close as possible to the date 
of the evaluated map. 


Step 5 


To facilitate the collection of reference data, we suggest 
creating multiple windows to display images with different 
dates or resolutions, a good way to work when assessing 
land-cover change maps. 

In the Layers panel, select the Image and vector layer 
(Centroids of sample sites—Marqués de Comillas) that will 
be added to the second window, click on the “Manage Map
Themes” button (represented as an eye) and select “Add
Theme”. Name the theme “Image 1”. Then go to “View” in
the main menu, and select “New Map View”. This will
create a new display window. In the new window, do
the following: (a) select the set of layers to display (Image 1)
by clicking on “Manage Map Themes”; (b) synchronise
the windows by selecting the “View Settings” tool and
then clicking on the “Synchronise scale” option. The
example in Fig. 7 shows a Google server image (2) and a
true-colour Sentinel image (1).


Step 6 


To capture the reference data, the “Centroids of sample sites 
—Marqués de Comillas" vector layer must be edited and a 
new field (integer type) must be added. We suggest naming 
it "Refer data". 


Fig. 6 Exercise 2. Step 4. Google Earth Engine Data Catalog plugin
[Dialogue settings shown: dataset COPERNICUS/S2_SR; bands True
color (TCI_R-TCI_G-TCI_B); start date (inclusive) 2019-03-01; end
date (exclusive) 2019-04-30; maximum number of images 5]


Fig. 7 Exercise 2. Step 5. Synchronising windows


Step 7 


To make reference data collection easier, we recommend 
displaying the Attribute Table as a form and anchoring it to 
the main window, displaying only the selected data. To do 
this, open the Attribute Table, select the “Dock Attribute 
Table” icon, select the “switch to form view” button and 
then “show Selected Feature” (Fig. 8). 


Step 8 


If you have completed Steps 1 to 7 successfully, you can now
start photointerpreting high-resolution satellite images. The
exercise involves identifying the predominant land-cover
type in each sampling unit and recording the corresponding
code in the “Refer data” column of the attribute table
(Fig. 9). The meanings of the codes are described in the
auxiliary data distributed with the Marqués de
Comillas LUC map, available at
https://doi.org/10.5281/zenodo.5418318.

Photointerpreting all the spatial units of the sample can be
a lengthy process, so we suggest that you try to photointerpret
at least 10 to 20 spatial units and then compare your
results with the “Photo-interpreted reference dataset—Marqués
de Comillas 2019”, a reference dataset prepared by
the authors. It is available, together with all the book's data, at
https://doi.org/10.5281/zenodo.5418318. For more information,
see Chapter “About This Book”.


Results and Comments 


The result of this exercise should be a shapefile with an attribute 
table in which the columns class (map class code) and refer_data
(photointerpreted class code) are filled in, as shown in Fig. 10. 
From the attribute table, you can now calculate the error matrix 
and the map accuracy statistics, as is done later in Sect. 5 in 


Fig. 9 Exercise 2. Step 8. Photointerpretation over images of different resolutions displayed on synchronised windows


Fig. 10 Results from Exercise 2. Table of attributes with the map
codes and the photointerpreted codes

chapter “Metrics Based on a Cross-Tabulation Matrix to Validate
Land Use Cover Maps” of this book.
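As a preview of what can be done with the two columns, the sketch below tallies an error matrix from (map class, reference class) pairs and derives two simple proportions. The six pairs are hypothetical, and the figures are raw sample proportions. The area-weighted estimators appropriate for a stratified design are not covered by this sketch.

```python
from collections import Counter

def error_matrix(pairs):
    """Tally (map class, reference class) pairs taken from the attribute table."""
    return Counter(pairs)

def users_accuracy(matrix, map_class):
    """Share of sampled units mapped as map_class whose reference label agrees."""
    row_total = sum(n for (m, _), n in matrix.items() if m == map_class)
    return matrix[(map_class, map_class)] / row_total

# Hypothetical (class, refer_data) values for six sampling units.
pairs = [(290, 290), (290, 290), (290, 261), (261, 261), (261, 290), (50, 50)]
matrix = error_matrix(pairs)

# Proportion of units in which the map and the reference agree.
overall = sum(n for (m, r), n in matrix.items() if m == r) / len(pairs)
print(matrix)
print(overall, users_accuracy(matrix, 290))
```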

If the images used in the validation were acquired on a
date different from the one represented by the LUC map,
this must be taken into account
when assigning labels. This date mismatch may increase the
uncertainty of the reference data, a situation that should be
avoided.

It is worth remembering that in the absence of high spatial 
resolution imagery, medium resolution imagery, such as 
Landsat or Sentinel, can provide sufficient information to 
validate maps, especially small-scale maps. 

Although the spatial assessment unit used in this exercise
is widely used and recommended, it may contain several
land-cover types. This means that clear rules should be
established when deciding the category to which the unit
should be allocated in these circumstances.
Abstract


In this chapter, we describe the fundamental principles 
and the normal procedure followed when cross-tabulating 
two datasets. Cross-Tabulation analysis (Sect. 1) is usu- 
ally the first step in the validation of Land Use Cover 
(LUC) data. It compares two datasets to observe their 
spatial relationship, i.e. their degree of spatial (dis) 
agreement. Results are usually displayed in the form of 
maps, tables and other statistical measures. Multiple- 
Resolution Cross-Tabulation (Sect. 2) compares two 
raster datasets at multiple spatial resolutions. Basic 
Cross-Tabulation can compare raster and vector data, 
while Multiple-Resolution Cross-Tabulation only works 
with raster data, which is what we use in the exercises 
provided as examples. In the exercises, raster data were 
obtained from vector data previously rasterized at differ- 
ent spatial resolutions. As a reference we use LUC maps, 
although ground points could also be used as reference 
data for these analyses. Examples of Cross-Tabulation 
analyses at one and multiple resolutions are presented for 
four different cases: to validate single LUC maps, to 
validate the soft maps produced by a model, to validate a 
simulation exercise and to validate and study land change 
in a series of LUC maps. In the example exercises, we 
used CORINE and SIOSE maps from the Asturias Central 
Area database, as well as maps from the modelling 
exercises carried out with this database. In the chapters
“Metrics Based on a Cross-Tabulation Matrix to Validate
Land Use Cover Maps” and “Pontius Jr. Methods Based
on a Cross-Tabulation Matrix to Validate Land Use Cover
Maps”, we focus on specific analyses that can be carried
out on the basis of Cross-Tabulation analyses, such as
Land Use Cover Changes (LUCC) Budget or Quantity
and Allocation disagreement. These help unleash the full
potential of Cross-Tabulation analysis.


Keywords 


Cross-Tabulation · Multiple-Resolution · Land Use
Cover data · Validation


1 Basic Cross-Tabulation


Description


Cross-Tabulation is a primary analysis that crosses two 
datasets, either raster or vector, to analyse their spatial 
relation. This analysis combines the datasets in spatial terms. 
It produces a map or table that shows how the values of one 
dataset spatially relate with the values in the other, thereby 
informing us as to whether the two datasets share the same 
values at a given location and, if not, with which other 
values they have established a relation. 
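The idea can be illustrated with two very small grids. The following sketch is not one of the QGIS tools used in this chapter. It is a plain Python tally, with hypothetical data, of how often each pair of values occurs at the same location.

```python
from collections import Counter

def cross_tabulate(reference, compared):
    """Count how often each (reference value, compared value) pair occurs.

    Both inputs are equally sized grids given as lists of rows.
    """
    if len(reference) != len(compared) or any(
            len(r) != len(c) for r, c in zip(reference, compared)):
        raise ValueError("the two grids must have the same dimensions")
    table = Counter()
    for ref_row, cmp_row in zip(reference, compared):
        table.update(zip(ref_row, cmp_row))
    return table

# Hypothetical rasters of 2 rows by 3 columns.
reference = [[1, 1, 2],
             [1, 2, 2]]
compared = [[1, 2, 2],
            [1, 2, 3]]
table = cross_tabulate(reference, compared)
for (ref_value, other_value), count in sorted(table.items()):
    print(ref_value, other_value, count)
```

Pairs with equal values, such as (1, 1), are locations where the two datasets agree. All other pairs show which value in one dataset corresponds to which value in the other.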


Utility 


Exercises 


1. To validate a map against reference data/map 

2. To validate soft maps produced by the model against a reference 
map 

3. To validate a simulation against a reference map 

4. To validate a series of maps with two or more time points


Starting with a map and some reference data, we can use
Cross-Tabulation to determine to what extent the map we
want to validate agrees with the reference data. In this way
we can compare the success of a LUC classification exercise
or a LUCC modelling exercise against reference data. We 
can also assess how uncertain a map is with regard to the 
data used as a reference. Cross-Tabulation can also be used 
to study the LUC changes between pairs of maps at two or 
more different points in time, or to validate a chronological 
series of maps, as it can detect unusual or abnormal changes, 
which could be due to technical errors. 

The Cross-Tabulation matrix provides users with a lot of 
information from the maps in one single analysis. However, 
in order to take advantage of the full potential of this anal- 
ysis, it is important for them to understand what all this 
information means. This is what we will be explaining in this 
chapter. 

The results of Cross-Tabulation can then be used to make 
further analyses and to extract other metrics that allow us to 
take full advantage of this basic analysis. These methods 
(e.g. LUCC budget, Quantity & Allocation disagreement, 
the Figure of Merit, Intensity Analysis) (see Sects. 2, 3, 4 
and 6 in Chapter "Pontius Jr. Methods Based on a Cross- 
Tabulation Matrix to Validate Land Use Cover Maps") make 
it easier for users to interpret the results. However, they also 
require many further analyses and are therefore more 
time-consuming. We will now provide an overview of some 
relevant examples: 


* Hagen-Zanker (2009) used a well-known
  Cross-Tabulation matrix to improve the fuzzy Kappa
  statistic (see Sect. 3 in Chapter “Metrics Based on a
  Cross-Tabulation Matrix to Validate Land Use Cover
  Maps”).
* Alcamo et al. (2011) used the Cross-Tabulation function
  with potential maps from a land use change model.
* Mas et al. (2014) and Paegelow et al. (2014) used
  Cross-Tabulation in different ways to provide useful
  information to help them assess land change model
  robustness.
* Krüger and Lakes (2015) calculated a disagreement index
  from the Cross-Tabulation matrix used in LUCC modelling
  exercises.
* Pontius (2018) created an Excel spreadsheet that performs
  a range of automatic analyses from the Cross-Tabulation
  matrix.


The maps to be compared or assessed may be in either 
raster or vector format. For those in raster, we can use both 
hard and soft maps, such as suitability, transition potential 
and probabilities maps. 


M. T. Camacho Olmedo and D. García-Álvarez 


QGIS Exercises 


The methods and techniques presented in Chapter “Pontius 
Jr. Methods Based on a Cross-Tabulation Matrix to Validate 
Land Use Cover Maps" (e.g. LUCC Budget, Intensity 
Analysis, Quantity and Allocation disagreement...) are 
based on this basic Cross-Tabulation analysis. In this 
chapter, we will therefore only be describing the funda- 
mental principles and the normal procedure followed when 
performing a Cross-Tabulation between two datasets. 


Available tools 


* Processing Toolbox
    SAGA
      Image analysis
        Confusion matrix (two grids)
        Confusion matrix (polygons/grid)
      Raster analysis
        Cross-classification and tabulation
* Processing Toolbox
    GRASS
      Raster
        r.cross
* Semi-Automatic Classification Plugin
    Tab: Postprocessing
      Section: Cross classification
      Section: Accuracy
      Section: Land cover change


QGIS includes many tools for cross-tabulating spatial 
data through the associated GRASS and SAGA models. The 
“Semi-Automatic Classification Plugin" also includes tools 
to cross-tabulate datasets for different purposes. 

Table 1 includes a review of the available Cross-Tabulation
tools in QGIS. It provides information on the
input and output parameters of each tool. Although the
r.kappa function also cross-tabulates two raster maps to obtain
the Kappa index, we will not be analysing it in this chapter.
Those interested in using this tool should refer to the Kappa
indices, Sect. 3 in Chapter “Metrics Based on a Cross-Tabulation
Matrix to Validate Land Use Cover Maps”.

The associated R software can also be used to
cross-tabulate pairs of maps. This is done using the crosstab
function, which is part of the “raster” package.¹ As QGIS
already provides many tools for carrying out this analysis,
we will not be covering the implementation of this R function
in QGIS here.

Of all the tools available in QGIS, the one we will be
recommending and using in this book is the “Semi-Automatic
Classification Plugin”, which proved to be the most efficient
and stable of all those assessed.


¹ https://cran.r-project.org/web/packages/raster/raster.pdf.


Exercise 1. To validate a map against reference 
data/map 


Aim 


To validate the CORINE 2011 Land Use map, taking the 
SIOSE 2011 Land Use map as a reference. 


Materials 


SIOSE Land Use Map Asturias Central Area 2011 
CORINE Land Use Map Asturias Central Area 2011 


Requisites 


The two maps must have the same extent, spatial resolution, 
projection and classification legend. If the maps have dif- 
ferent classification legends, the user must reclassify the 
maps in such a way as to unify the two legends. 


Execution 


Step 1 


Open the “Semi-Automatic Classification Plugin" and select 
the “Postprocessing” tab from the sidebar. Then click on 
Accuracy and select the required parameters: raster to assess 
(CORINE map) and reference raster (SIOSE map) (Fig. 1). 


Fig. 1 Exercise 1. Step 1. Semi-automatic classification plugin


Results and Comments


Once the function has been executed, QGIS creates an 
output raster that gives each pixel a code. This code iden- 
tifies every single possible combination of values between 
the two input rasters. The meaning of each code is presented 
in a table in CSV format, which is stored in the same folder 
as the raster. This information is also displayed in the 
"output" window of the “Semi-Automatic Classification 
Plugin" (Fig. 2). 

If we analyse the first matrix shown in the “output” win- 
dow (ErrMatrixCode/Reference/Classified/PixelSum), it will 
help us understand the meaning of the codes in the raster. The 
“ErrMatrixCode” is the number that identifies each pixel in 
the new raster. "Reference" is the code for the category on the 
reference map (i.e. SIOSE Land Use Map). “Classified” is the 
code for the category on the compared map (i.e. CORINE 
Land Use Map), and “PixelSum” refers to the number of 
pixels for each combination in the new raster. 

The ErrMatrixCode 1 identifies 234,164 pixels (Pix- 
elSum) in category 0 in SIOSE (Reference) and 0 in COR- 
INE (Classified). The codes for combinations in which the 
reference and the classified categories are the same (e.g. 0, 0) 
mean agreement, while those in which the reference and the 
classified categories are different (e.g. 0, 1) mean disagree- 
ment. Code 2 is therefore a disagreement area because the 
pixel is classified as 0 in SIOSE and as 1 in CORINE. 

If we symbolize the obtained raster in such a way that all 
the codes that refer to combinations of the same classes 
(1, 15, 29, 43, 57, 71, 85, 99, 113, 127, 141, 155) are labelled 
as agreement and all the codes that refer to combinations of 
different classes are labelled as disagreement, we can obtain a 


Fig. 2 Results from Exercise 1 displayed in the “output” window of the Semi-Automatic Classification Plugin


map like the one presented in Fig. 3. Code 169 is not rep- 
resented on this new map because it refers to pixels that are 
background (category 12) in both SIOSE and CORINE. 

Although the map in Fig. 3 illustrates the general pattern of 
disagreement areas, it does not provide much information 
about the particular characteristics of the disagreement 
between the two datasets. For a better understanding of how 
similar/different CORINE is from SIOSE, other maps must be 
drawn up. 

With the obtained raster, we can for example represent
where the urban fabric of CORINE (2) is confused with other
classes in SIOSE. We can even detail which classes of
SIOSE are affected.

To do so, we must first identify the codes (ErrMa- 
trixCode) for the combinations we are looking for, i.e. pixels 
which are urban fabric in CORINE (Classified is 2) and 
which belong to any other category in SIOSE (Reference is 
not 2). These are codes 3, 16, 42, 55, 68, 81, 94, 107, 120, 
133, 146 and 159. We can also represent the pixels that both 
CORINE and SIOSE label as urban fabric (code 29). The 
resulting map can be seen in Fig. 4. 
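
The codes quoted above follow a regular pattern that can be reproduced with a few lines of Python. The sketch below assumes the rule `code = reference * 13 + classified + 1`, which is inferred from the codes listed in this exercise (13 categories, numbered 0 to 12) and is not taken from the plugin documentation; other plugin versions may number the combinations differently.

```python
N_CLASSES = 13  # categories 0..12, as in the maps used in this exercise


def err_matrix_code(reference, classified, n_classes=N_CLASSES):
    """Code of a (reference, classified) pair under the inferred rule."""
    return reference * n_classes + classified + 1


def decode(code, n_classes=N_CLASSES):
    """Inverse of err_matrix_code: return (reference, classified)."""
    return divmod(code - 1, n_classes)


# Codes where both maps carry the same category (the matrix diagonal)
agreement_codes = [err_matrix_code(c, c) for c in range(N_CLASSES)]

# Codes where CORINE (classified) is urban fabric (2) and SIOSE is not
urban_confusion_codes = [
    err_matrix_code(ref, 2) for ref in range(N_CLASSES) if ref != 2
]

print(agreement_codes)
print(urban_confusion_codes)
print(decode(2))
```

Checking a handful of decoded codes against the first table in the “output” window is a quick way to confirm that the rule holds for the plugin version in use.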


This map shows the city of Oviedo and its immediate 
surrounding area. Most of the city is identified as urban 
fabric in both SIOSE and CORINE. However, CORINE also 
labels as urban fabric many small patches that SIOSE 
identifies, for example, as industrial or commercial areas or 
artificial green urban areas. This disagreement is to be 
expected given the different Minimum Mapping Units 
(MMU) and Minimum Mapping Widths (MMW) of both 
databases. The MMU used in CORINE is 25 ha, whereas in 
SIOSE it is only 0.5-2 ha. The result is that many of the 
small patches inside the city that SIOSE identifies as other 
classes are classified as urban fabric in CORINE because of 
the scale at which this map was made. CORINE offers a 
much more generalized picture of the landscape to be 
mapped out. This means that when validated against SIOSE, 
numerous errors emerge due to the level of generalization. 

In addition to the map, the accuracy analysis of the 
“Semi-Automatic Classification Plugin” also generates two 
error/Cross-Tabulation matrixes, one in cells and the other in 
square meters (area proportions). The matrix in cells (Fig. 5) 
shows the number of pixels for each combination. For 


Fig. 3 Result from Exercise 1. Map showing areas of agreement and 
disagreement between CORINE and SIOSE maps 


example, if we look at the combination 0—0, we see that 
there are 234,164 pixels that have the same value (0) in 
SIOSE and CORINE. In other words, there are 234,164 
pixels classified as agricultural areas on both maps. 

The area-based error matrix gives us the same information
in different units: the proportion of the total area of the raster
represented by each combination. Using the example above, the
matrix shows that combination 0-0 covers a proportion of
0.2535 of the map, i.e. 25.35% of its pixels (Fig. 6).

An analysis of the two tables (area-based error matrix and
error matrix pixel count) offers us a detailed picture of how
the categories on one map relate to the categories on the
other. This highlights the degree of agreement between the
reference map and the one we are trying to validate. In both
tables, the combinations in which there is agreement can be
seen on a diagonal line running across the table. All combinations
outside this diagonal mean disagreement (Table 2).

If we look at urban fabric, of a total of 28,110 pixels
labelled as urban fabric in CORINE (Total column on the
right), 19,455 are also labelled as urban fabric in SIOSE.
That is, almost 70% of the pixels identified as urban fabric
by CORINE are also considered urban fabric in SIOSE. In
the other 30%, CORINE mostly confuses urban fabric with
industrial and commercial areas (category 3, 2244 confused
pixels), artificial green urban areas (category 9, 1643 confused
pixels) and road and rail networks (category 6, 1216
confused pixels).
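
The percentages above can be checked by hand from the figures already quoted. The short Python sketch below uses the row total given in the text (28,110), the column total shown in Fig. 5 (27,514) and the diagonal value (19,455); the variable names are illustrative only.

```python
agreement = 19455          # urban fabric in both CORINE and SIOSE
classified_total = 28110   # all pixels CORINE labels as urban fabric (row total)
reference_total = 27514    # all pixels SIOSE labels as urban fabric (column total)

# Share of CORINE urban fabric confirmed by SIOSE (user's accuracy)
users_accuracy = 100 * agreement / classified_total
# Share of SIOSE urban fabric also found by CORINE (producer's accuracy)
producers_accuracy = 100 * agreement / reference_total

# Main confusions, as a share of the CORINE urban fabric pixels
confused = {
    "industrial and commercial areas (3)": 2244,
    "artificial green urban areas (9)": 1643,
    "road and rail networks (6)": 1216,
}
confusion_share = {k: 100 * v / classified_total for k, v in confused.items()}

print(round(users_accuracy, 2), round(producers_accuracy, 2))
print({k: round(v, 2) for k, v in confusion_share.items()})
```

The two results agree with the UA and PA values reported for category 2 in Fig. 6.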

These results are due to the greater degree of general- 
ization when mapping CORINE, as explained above. On the 
basis of these results and taking SIOSE as a reference, we 
can conclude that CORINE maps urban fabric correctly and 
can be considered a valid map for our exercises. 

Users can also carry out more complex analyses with these 
matrixes using the CSV file generated by the tool. In this way 


Fig. 4 Result from Exercise 1. Map showing areas of agreement and disagreement between CORINE and SIOSE maps for the CORINE category
“urban fabric”. The map specifies with which categories of SIOSE the urban fabric category of CORINE is confused. Legend: urban fabric (agreement); confusion with agricultural areas, vegetation areas, industrial and commercial areas, mineral extraction sites, dump sites, road and rail networks, port areas, airports, artificial green urban areas, open spaces with little or no vegetation and water bodies; background


> ERROR MATRIX (pixel count)

> Reference
V_Classified         0         1         2         3
0               234164     25054      5531      1871
1                16853    220884       360       329
2                 2438       553     19455      2244
3                 1233      1003      1500     13330
4                  309       167        78        75
5                   53        34        23        57
6                  544       545       311       533
7                    0         9        14       117
8                   10       465         0        18
9                  439       539       222       189
10                  45       734        15        24
11                  13        73         5        25
12                   0         0         0         0
Total           256101    250060     27514     18812


Fig. 5 Result from Exercise 1 displayed in the “output” window of the Semi-Automatic Classification Plugin: Error matrix in pixels 


> AREA BASED ERROR MATRIX

> Reference
V_Classified          0          1          2          3
0                0.2535     0.0271     0.0060     0.0020
1                0.0182     0.2392     0.0004     0.0004
2                0.0026     0.0006     0.0211     0.0024
3                0.0013     0.0011     0.0016     0.0144
4                0.0003     0.0002     0.0001     0.0001
5                0.0001     0.0000     0.0000     0.0001
6                0.0006     0.0006     0.0003     0.0006
7                0.0000     0.0000     0.0000     0.0001
8                0.0000     0.0005     0.0000     0.0000
9                0.0005     0.0006     0.0002     0.0002
10               0.0000     0.0008     0.0000     0.0000
11               0.0000     0.0001     0.0000     0.0000
12               0.0000     0.0000     0.0000     0.0000
Total            0.2773     0.2708     0.0298     0.0204
Area          640252250  625149756   68784973   47029982
SE               0.0003     0.0002     0.0001     0.0001
SE area          580258     542766     293597     248955
95% CI area     1137306    1063821     575449     487951
PA [%]          91.4342    88.3324    70.7095    70.8590
UA [%]          85.6793    90.5351    69.2102    64.6742
Kappa hat        0.8018     0.8702     0.6826     0.6394


Fig. 6 Result from Exercise 1 displayed in the “output” window of the Semi-Automatic Classification Plugin: Area based error matrix 
(proportions) 


Table 2 Traditional scheme of a Cross-Tabulation matrix, differentiating which cells indicate agreement between the compared maps and which cells indicate disagreement

Classified    Reference
              0              1              2              3
0             Agreement      Disagreement   Disagreement   Disagreement
1             Disagreement   Agreement      Disagreement   Disagreement
2             Disagreement   Disagreement   Agreement      Disagreement
3             Disagreement   Disagreement   Disagreement   Agreement


the matrixes can be imported in spreadsheet format with
software such as Excel or OpenOffice Calc. We can then 
calculate the agreement and disagreement percentages for the 
whole raster or for each of the categories under consideration, 
as we did manually for the urban fabric above. 
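
The same calculation can be scripted instead of being done in a spreadsheet. The following Python sketch assumes that the matrix has already been read into a square nested list with classified categories in rows and reference categories in columns; reading the CSV itself is left out because its exact layout may depend on the plugin version. The function name and the toy matrix are hypothetical.

```python
def agreement_summary(matrix):
    """Return overall agreement (%) and agreement per classified category (%).

    `matrix[i][j]` is the number of pixels classified as i whose
    reference category is j. Rows without pixels yield None.
    """
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    overall = 100 * diagonal / total
    per_class = []
    for i, row in enumerate(matrix):
        row_total = sum(row)
        per_class.append(100 * row[i] / row_total if row_total else None)
    return overall, per_class


# Toy 3 x 3 matrix, not taken from the exercise data
toy_matrix = [
    [50, 5, 5],
    [10, 20, 0],
    [0, 0, 10],
]

overall, per_class = agreement_summary(toy_matrix)
print(overall, [round(p, 2) for p in per_class])
```

Overall disagreement is simply 100 minus the overall agreement.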

The error matrixes also provide useful statistical measures
(Fig. 6), such as the standard error (SE), confidence interval
(CI), producer's accuracy (PA), user's accuracy (UA),
overall accuracy (in %) and Kappa (see Sect. 3 in Chapter
“Metrics Based on a Cross-Tabulation Matrix to Validate
Land Use Cover Maps”).


Exercise 2. To validate soft maps produced by the 
model against a reference map 


Aim 


To find out whether the urban fabric soft map produced by 
our model agrees with the urban fabric areas of the reference 
map for the year of the simulation. 


Materials 


CORINE Land Use Map Asturias Central Area 2011 
Urban fabric suitability map — CORINE model 


Requisites 


The two maps must have the same extent, spatial resolution 
and projection. The soft map must be categorical. The Land 
Use map must only contain information about the category 
being assessed. For a proper validation, the reference map 
must refer to the same date on which the landscape was 
simulated. 


Execution 
Step 1 


Only discrete or categorical maps can be cross-tabulated. As 
the soft map we want to validate is continuous (continuous 
values from 0.1 to 1), the first step must be to convert it into a 
categorical map, using the Reclassify by table function (Pro- 
cessing toolbox » Raster analysis » Reclassify by table). 
After opening this tool, we select the map we want to 
reclassify (Urban fabric suitability map) and fill in the 
"Reclassification table" with the new values that will be 
replacing the old ones in the raster (Fig. 7). In this case, we 
are going to reclassify the values on our suitability map 
(0-1) into four categories, from low to high suitability. The
new categories will be 1 (suitability 0-0.25), 2 (0.25-0.50),
3 (0.50-0.75) and 4 (0.75-1).
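
The reclassification rule can be written down explicitly, which also forces a decision about values that fall exactly on a class boundary. The Python sketch below assumes that each upper limit belongs to the lower class (for example, 0.25 goes to category 1); this is an assumption made for illustration, and the boundary handling actually applied depends on the options chosen in the Reclassify by table tool.

```python
def suitability_class(value):
    """Map a suitability value in [0, 1] to one of four ordered categories."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"suitability must lie between 0 and 1, got {value}")
    if value <= 0.25:
        return 1  # lowest suitability
    if value <= 0.50:
        return 2
    if value <= 0.75:
        return 3
    return 4      # highest suitability


samples = [0.0, 0.25, 0.3, 0.75, 0.9, 1.0]
print([suitability_class(v) for v in samples])
```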


Step 2 


Given that our objective is to compare the suitability values 
for urban fabric in the model with the areas classified as 
urban fabric on the 2011 map, we must ignore all the other 
categories on the Land Use Cover map. We must therefore 
obtain a binary map from the initial CORINE map. In this 
binary map, 1 will mean the category being evaluated (urban 
fabric) and 0 all the others. 

To obtain this binary map, we repeat the same process as 
in Step 1. In this case, we reclassify the CORINE map, 
assigning a value of 1 to urban fabric (code 2 in the original 
map) and a value of 0 to the other categories (codes 0, 1, 3, 
4, 5, 6, 7, 8, 9, 10, 11 and 12) (Fig. 8). 
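
In code, this binary reclassification is a one-line test per pixel. The following Python sketch works on a small hypothetical patch held as a nested list; only the urban fabric code (2) comes from the exercise.

```python
URBAN_FABRIC = 2  # code of urban fabric in the CORINE map of this exercise


def to_binary(grid, target=URBAN_FABRIC):
    """Return a grid with 1 where the pixel equals `target` and 0 elsewhere."""
    return [[1 if code == target else 0 for code in row] for row in grid]


# Hypothetical 3 x 3 patch of the Land Use Cover map
corine_patch = [
    [0, 2, 2],
    [1, 2, 3],
    [12, 6, 2],
]

binary = to_binary(corine_patch)
print(binary)
```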


Step 3 


Once we have obtained the two maps, we then carry out
Cross-Tabulation using the “Semi-Automatic Classification
Plugin”. We click on the “Postprocessing” tab and select the
Cross classification option.

We then select the required parameters. In “Select the 
classification" we choose the reference Land Use Cover map 
obtained after reclassification (Step 2). In “Select the refer- 
ence vector or raster” we choose the soft map obtained after 
reclassification (Step 1) (Fig. 9). 


Results and Comments 


Once the function has been executed, QGIS creates a raster 
and a CSV file with all the results of the Cross-Tabulation. 
These are also displayed in the “Output” window (Fig. 10). 

The first table provides information about the meaning of
each code in the new raster. Pixels with value 2 refer to areas
that are urban fabric (Classification is 1) and have a suitability
of less than 0.25 (Reference category is 1). This
combination occurs in just 2 pixels (PixelSum), which represent
an area of 5000 m² on the map (Area [metre^2]).

The second table gives an overview of the possible
combinations on the two maps and the area, in square
meters, covered by these combinations. This shows that the
areas that are not urban fabric (Classification is 0.0) and have
a suitability of below 0.25 (Reference 1) occupy
2,312,499 m².

Fig. 7 Exercise 2. Step 1. Reclassify by table


Fixed table
       Minimum   Maximum   Value
1      0         1         0
2      1         2         1
3      2         12        0


Fig. 8 Exercise 2. Step 2. Table required for the Reclassify by Table tool


From all the possible combinations, we can see that most
of the pixels that are urban fabric on the reference map fit
with the areas with the highest suitability to become urban
fabric (26,137 pixels, 65,342,474 m²). There are relatively
few urban fabric pixels with a suitability of between 0.5 and
0.75 (1971 pixels, 4,927,498 m²) and an insignificant
number with a suitability of less than 0.5.
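
These figures are easier to compare when expressed as shares of the total urban fabric area. The Python sketch below uses the urban fabric areas legible in Fig. 10 for suitability categories 1, 3 and 4; no area is listed there for category 2, and the three values add up to the urban fabric total shown in the figure, so the category is treated as empty here.

```python
# Urban fabric area (m^2) per suitability category, as listed in Fig. 10
urban_area_m2 = {
    1: 4999.998049313358,   # suitability below 0.25
    3: 4927498.077598315,   # suitability 0.50-0.75
    4: 65342474.50745163,   # suitability 0.75-1
}

total_urban_m2 = sum(urban_area_m2.values())
share = {cat: 100 * area / total_urban_m2 for cat, area in urban_area_m2.items()}

for cat in sorted(share):
    print(f"category {cat}: {share[cat]:.2f}% of urban fabric")
```

Roughly 93% of the urban fabric falls in the highest suitability category and about 7% in the second highest, which is the pattern described above.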

These results indicate that our suitability map has been
validated. In other words, the high suitability values on the
soft map correspond with urban fabric areas on the reference
map. For their part, the low suitability values correspond to
areas where there is no urban fabric on the map. This means
that when we use this map in our simulation, it will help us
to correctly identify those areas that can become urban in the
future and those that cannot.

Other more sophisticated tools, such as the ROC curve 
and the Difference in Potential (see Sects. 2 and 3 in Chapter 
“Validation of Soft Maps Produced by a Land Use Cover 
Change Model"), can be used to complement this analysis 
and offer the user a full overview of the validity of their 
potential maps. 


Fig. 9 Exercise 2. Step 3. Semi-Automatic Classification plugin


Reference   Classification   Area [metre^2]
1.0         0.0              2312499.0978074283
1.0         1.0              4999.998049313358
3.0         0.0              578867274.1621801
3.0         1.0              4927498.077598315
4.0         0.0              962127124.6381476
4.0         1.0              65342474.50745163

Classification   Reference
                 3.0          4.0           Total
0.0              578867274    962127124     1543306897
1.0              4927498      65342474      70274972
Total            583794772    1027469599    1613581870


Fig. 10 Results from Exercise 2 displayed in the “output” window of the Semi-Automatic Classification Plugin


Exercise 3. To validate a simulation against a reference map


Aim


To validate a simulation for the year 2011, which we
obtained through our LUCC modelling exercise with
CORINE maps, against a CORINE reference map for the
year 2011.


Materials


Simulation CORINE Asturias Central Area 2011
CORINE Land Use Map Asturias Central Area 2011


Requisites


The two maps must have the same extent, spatial resolution,
projection and legend. For a proper validation, the reference
map must refer to the same date for which the landscape was
simulated.


Execution
Step 1 


Open the “Semi-Automatic Classification Plugin”, click on
the “Postprocessing” tab and select the section Accuracy.
Then, select the required parameters: raster to assess
(Simulation) and reference raster (CORINE reference map)
(Fig. 11).


Results and Comments 


When we execute the function, QGIS creates an output raster
showing the combination of classes between the two input
maps. The function generates three tables in the “output”
window, which are also stored in CSV format in the same
folder as the raster. They specify the meaning of each code
in the new raster. They also include two
error/Cross-Tabulation matrixes, one in cells and the other in
area proportions (Fig. 12). Statistical measures
such as standard error (SE), confidence interval (CI),
producer’s accuracy (PA), user’s accuracy (UA), overall
accuracy (%) and Kappa are also provided in the tables.

If we symbolize the raster and focus on the information in 
the Cross-Tabulation matrix of most interest for assessing 
our simulation, we can understand the errors we made in our 
modelling exercise in greater detail. 

In our exercise we only actively modelled two categories: 
urban fabric and industrial areas. In the raster we can identify 
the simulated areas that show agreement (or disagreement) 
with the reference map for each of these two categories. To
do this, the first step is to identify the codes for the
combinations involving the two categories being considered:
urban fabric (2) and industrial and commercial areas (3).

The combination codes for urban fabric are 3, 16, 29, 42,
81, and 120. Of these, code 29 represents the areas that were
simulated as urban fabric (Classified is 2) and also appear as
urban fabric on the reference map (Reference is 2). The
combination codes for industrial and commercial areas are 4,
17, 30, 43, and 82. Of these, code 43 represents the pixels that
are industrial and commercial areas in both the simulation
and the reference map.

If we symbolize the raster obtained using these codes in 
terms of agreement (codes 29 and 43) and disagreement (all 
the other codes mentioned above), we can visualize the 
pattern of error in our simulations compared to the map we 
use as a reference (Fig. 13). 

Most of the simulated areas agree with the reference 
map. Disagreement can only be observed in a few cases. 
However, this conclusion may be misleading. Most of the 
agreement refers to areas that were already urban fabric or 
industrial and commercial areas, i.e. areas that were correctly 
simulated as permanence. 

Simulating permanence for artificial surfaces is very easy. 
A high rate of success is expected in all cases. If we focus on 
the areas that actually changed during the simulation period in 
relation to the reference map and those that were simulated as 
change, we can detect a higher proportion of errors. However, 
this cannot be detected on our map. In order to focus on these 
errors, we should only cross-tabulate the changes in the sim- 
ulation with respect to the initial map (CORINE 2005) and the 
changes in the reference map (CORINE 2011) with respect to
the initial map (CORINE 2005). Using this method, the new 
map and the Cross-Tabulation table would only assess those 



Fig. 11 Exercise 3. Step 1. Semi-Automatic Classification plugin 


> ERROR MATRIX (pixel count), excerpt: reference columns 0, 1, 3, 4 and 5

> Reference
V_Classified         0         1         3         4         5
0               271263       374       364       148        11
1                  579    243071        76       157       109
2                  621        60        28         0         0
3                  585        33     20034         0         0
4                  211       327        21      2504         0
5                    0         0         0         0       754
6                   44       111         0         0         0
7                    0         0         0         0         0
8                    0         0         0         0         0
9                    0         0         0         0         0
10                   0         0         0         0         0
11                   0         0        88         0         0
12                   0         0         0         0         0
Total           273303    243976     20611      2809       874

> AREA BASED ERROR MATRIX, excerpt: reference columns 0, 1, 3, 4 and 5

> Reference
V_Classified         0         1         3         4         5
0               0.2937    0.0004    0.0004    0.0002    0.0000
1               0.0006    0.2632    0.0001    0.0002    0.0001
2               0.0007    0.0001    0.0000    0.0000    0.0000
3               0.0006    0.0000    0.0217    0.0000    0.0000
4               0.0002    0.0004    0.0000    0.0027    0.0000
5               0.0000    0.0000    0.0000    0.0000    0.0008
6               0.0000    0.0001    0.0000    0.0000    0.0000
7               0.0000    0.0000    0.0000    0.0000    0.0000
8               0.0000    0.0000    0.0000    0.0000    0.0000
9               0.0000    0.0000    0.0000    0.0000    0.0000
10              0.0000    0.0000    0.0000    0.0000    0.0000
11              0.0000    0.0000    0.0001    0.0000    0.0000


Fig. 12 Results from Exercise 3 displayed in the “output” window of the Semi-Automatic Classification Plugin 


Fig. 13 Result from Exercise 3. Map showing areas of agreement and disagreement between our simulation and the reference map for the two
categories actively simulated: urban fabric, industrial and commercial areas


areas that changed between the two dates, thus removing
unchanged areas from the analysis.
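
The idea of assessing only the changes can be illustrated with a few pixels. The Python sketch below compares, pixel by pixel, the change observed between the initial and the reference map with the change simulated from the initial map. The function name, the component labels and the toy grids are hypothetical; this is an illustration of the reasoning, not the procedure implemented in the plugin.

```python
def change_components(initial, reference, simulated):
    """Count how simulated change compares with observed change."""
    counts = {"hits": 0, "wrong_hits": 0, "misses": 0,
              "false_alarms": 0, "persistence": 0}
    for row_i, row_r, row_s in zip(initial, reference, simulated):
        for i, r, s in zip(row_i, row_r, row_s):
            observed = r != i    # the pixel changed in reality
            predicted = s != i   # the pixel changed in the simulation
            if observed and predicted:
                # Change in both maps: correct only if the new class matches
                counts["hits" if r == s else "wrong_hits"] += 1
            elif observed:
                counts["misses"] += 1
            elif predicted:
                counts["false_alarms"] += 1
            else:
                counts["persistence"] += 1
    return counts


# Toy 3 x 3 grids (0 agricultural, 1 vegetation, 2 urban, 3 industrial)
initial   = [[0, 0, 0], [0, 2, 2], [1, 1, 2]]
reference = [[0, 2, 0], [0, 2, 2], [1, 3, 2]]
simulated = [[0, 2, 2], [0, 2, 2], [1, 1, 2]]

result = change_components(initial, reference, simulated)
print(result)
```

In this toy case 7 of the 9 pixels agree between the simulation and the reference map, yet only one of the three pixels involved in change was simulated correctly. This is the effect that a cross-tabulation of the whole maps hides.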

An analysis of the error/Cross-Tabulation matrixes leads
to similar conclusions. For urban fabric, out of a total of
28,183 pixels labelled as such on the simulation map (Total
column on the right), 27,402 pixels were also classified as
urban fabric on the reference map. A total of 621 pixels
are confused with agricultural areas, 60 with vegetation areas and
100 with other categories on the reference map. Most of the
confusion is therefore with categories where one would
expect new urban fabric to develop.

Once again, whereas most of the agreement refers to areas
that were already urban fabric in the past and were correctly
simulated as persistence, confusion seems to refer above all
to areas that were not correctly simulated, that is, agricultural
and vegetation areas where new urban fabric was
simulated but which, according to the reference map, did not
actually undergo any change. We therefore need to repeat
the analysis, focusing only on the areas that actually change,
so as to assess the success of our simulation more
effectively.

Other tools, such as the Figure of Merit (see Sect. 4 in
Chapter “Pontius Jr. Methods Based on a Cross-Tabulation
Matrix to Validate Land Use Cover Maps”), can also be
useful to help validate the simulation and overcome some of
the limitations we have encountered.


Exercise 4. To validate a series of maps with two or 
more time points 


Aim 


To study the land use change between two CORINE maps at 
two different time points: 2005 and 2011. 


Materials 


CORINE Land Use Map Asturias Central Area 2005 
CORINE Land Use Map Asturias Central Area 2011 


Requisites 


The two maps must be raster and must have the same extent, 
spatial resolution, projection and classification legend. If the 
maps have different classification legends, the user must 
reclassify the maps in such a way as to unify the two 
legends. The maps must refer to two different points in time. 


Execution
Step 1 


The first step is to obtain a raster for the whole study area, 
showing the areas that changed during the study period and 
those that remained the same. 

To get this map, open the “Semi-Automatic Classification
Plugin”, click on the “Postprocessing” tab and then select
Land cover change. Then, complete the required parameters,
selecting the older map as the reference classification
(CORINE 2005) and the more recent one as the new classification
(CORINE 2011). Mark the “Report unchanged
pixels” option.


Step 2 


To obtain a map that only shows the areas that changed 
during the study period, we must repeat the same operation, 
this time leaving the "Report unchanged pixels" option 
unmarked (Fig. 14). 


Results and Comments 


After executing Steps 1 and 2, QGIS creates two output 
rasters, one showing changes and permanent areas (Fig. 15) 
and the other showing just the changes between the two 
maps (Fig. 16). Each raster will identify each possible 
combination between categories or pixel values with a single 
unique code. 

The function also generates a table for each map, which is
displayed in the “output” window and stored in CSV format. This table
shows each possible combination and the code with which it
is represented in the output rasters (Fig. 17). All the combinations
are included in the table, even if no pixels actually
undergo this change.

Both the rasters and the table can be used to understand
the changes in our study area. The table shows those that
took place during the study period (Table 3) and includes
changes from agricultural areas (category 0 in CORINE
2005), vegetation areas (category 1), urban fabric (2),
industrial and commercial areas (3), mineral extraction sites
(4), road and rail networks (6) and water bodies (11).

Of the various different transitions of agricultural areas, 
the one to urban fabric (from category 0 in 2005 to category 
2 in 2011) is the most important with a total of 751 pixels. 
As regards the transitions in vegetation areas (category 1), 
the most common was the change from vegetation areas to 
agricultural areas (from category 1 in 2005 to category 0 in 
2011), with a total of 588 pixels. 


Fig. 14 Exercise 4. Step 2. Semi-Automatic Classification plugin


Fig. 15 Result from Exercise 4. Raster displaying the areas that are the
same in the two maps compared, that is, the areas of permanence in the
time series


Fig. 16 Result from Exercise 4. Raster displaying the areas that are
different in the two maps compared, that is, the areas of change in the
time series


ChangeCode   ReferenceClass   NewClass   PixelSum
1            0.0              1.0        374
2            0.0              2.0        751
3            0.0              3.0        503
4            0.0              4.0        148
5            0.0              5.0        11
6            0.0              6.0        301
7            0.0              7.0        0
8            0.0              8.0        0
9            0.0              9.0        132
10           0.0              10.0       0
11           0.0              11.0       0
12           0.0              12.0       0
13           1.0              0.0        588
14           1.0              2.0        61
15           1.0              3.0        82
16           1.0              4.0        157
17           1.0              5.0        109
18           1.0              6.0        225
19           1.0              7.0        0
…
25           2.0              0.0        21
…
32           2.0              8.0        0
…
36           2.0              12.0       0


Fig. 17 Results from Exercise 4 displayed in the “output” window of the Semi-Automatic Classification Plugin. Note: the ReferenceClass and NewClass columns may appear switched if a different version of the “Semi-Automatic Classification Plugin” is used.


Table 3 Result from Exercise 4. Table showing the transitions
detected between the two maps compared and their size


Code   CORINE 2005   CORINE 2011   Quantity of changes
1      0             1             374
2      0             2             751
3      0             3             503
4      0             4             148
5      0             5             11
6      0             6             301
9      0             9             132
13     1             0             588
14     1             2             61
15     1             3             82
16     1             4             157
17     1             5             109
18     1             6             225
22     1             10            180
25     2             0             21
26     2             1             22
27     2             3             26
33     2             9             4
37     3             0             51
39     3             2             11
49     4             0             211
50     4             1             327
51     4             2             89
52     4             3             21
73     6             0             44
74     6             1             111
136    11            3             88
140    11            7             657


Code column may appear with other values if using a different version
of the “Semi-Automatic Classification Plugin”

This change in pixels (Table 3) can be translated into a
change in area by multiplying the number of pixels by the area
each pixel covers. The spatial resolution of our raster is 50 m, so the
calculation is easy: a square with a 50 m side covers a
surface area of 2500 m². This is the area of each pixel.
Therefore, the transition from agricultural areas (0) to urban
fabric (2), which took place in 751 pixels, affected an area of
1,877,500 m².
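
The conversion is simple enough to apply to every row of Table 3. The following Python sketch shows it for three transitions taken from that table; the helper name is illustrative, and the 50 m resolution is the one stated above.

```python
def pixels_to_area_m2(pixel_count, resolution_m=50):
    """Area covered by `pixel_count` square pixels of side `resolution_m`."""
    return pixel_count * resolution_m ** 2


# A few transitions from Table 3: (from category, to category) -> pixels
transitions = {(0, 2): 751, (1, 0): 588, (11, 7): 657}

for (old, new), pixels in transitions.items():
    area_m2 = pixels_to_area_m2(pixels)
    print(f"{old} -> {new}: {area_m2} m^2 ({area_m2 / 10_000} ha)")
```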

Most of the change in our area was between agricultural
and vegetation areas and vice versa and from agricultural 
and vegetation areas to artificial surfaces. However, there 
were also various other interesting transitions, such as the 
conversion of water bodies into port areas (from category 11 
to category 7), which affected a total of 657 pixels. This was 
due to the construction of a dock in Gijón in the north of our 
study area. 

By symbolizing the raster of changes (Fig. 18), we can 
gain a spatial perspective of what changed. To obtain this 
map, we must group the changes together according to the 
new land use. Codes 13, 25, 37, 49 and 73 will show the 
areas that changed to agricultural areas. Codes 1, 26, 50 and 
74 will show changes to vegetation areas. Codes 2, 14, 39 
and 51 will show changes to urban fabric. Codes 3, 15, 27, 
52 and 136 will show changes to industrial and commercial 
areas. Codes 4 and 16 will show changes to mineral 
extraction sites. Codes 5 and 17 will show changes to dump 
sites. Codes 6 and 18 will show changes to road and rail 
networks. Code 140 will show changes to port areas. Codes 
9 and 33 will show changes to artificial green urban areas. 
Finally, Code 22 will show changes to open spaces with 
little or no vegetation. 
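
This grouping can also be derived programmatically instead of being read off the table. The Python sketch below assumes a numbering rule inferred from Table 3: with 13 categories, each original category has 12 consecutive codes, one per possible new category, skipping the category itself. This rule is an inference from the table and not documented behaviour; as noted under Table 3, the codes may differ in other plugin versions.

```python
N_CLASSES = 13  # categories 0..12


def decode_change(code, n_classes=N_CLASSES):
    """Return (old_class, new_class) for a change code (inferred rule)."""
    old, index = divmod(code - 1, n_classes - 1)
    # The "no change" combination is skipped in the numbering
    new = index if index < old else index + 1
    return old, new


# Codes with at least one changed pixel (Table 3)
table3_codes = [1, 2, 3, 4, 5, 6, 9, 13, 14, 15, 16, 17, 18, 22, 25, 26,
                27, 33, 37, 39, 49, 50, 51, 52, 73, 74, 136, 140]

codes_by_new_class = {}
for code in table3_codes:
    _, new_class = decode_change(code)
    codes_by_new_class.setdefault(new_class, []).append(code)

for new_class in sorted(codes_by_new_class):
    print(new_class, codes_by_new_class[new_class])
```

The resulting groups match the lists of codes given above, which is a useful check before the symbology is applied.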

In the composition of the map in Fig. 18, we also added
CORINE 2005 as the base layer, with an opacity of 10%, to
enable us to interpret the changes on the map better.

The map shows the changes for the example area of Gijón. 
In the north, we can observe the new dock built in the port 
area. Apart from the port, most of the growth in industrial 
land took place in the south of the city. The same is true for 
urban fabric, with the construction of a new residential 
development in Roces. As can be seen on the map, this new 
residential area is cut off from the existing urban fabric of the 
city. There is a highway running between the two. 

The results of this analysis can also be useful to validate a 
chronological series of maps. When interpreting the changes, 
it can help detect unrealistic changes that may be due to 
errors in the input data. We can also detect changes in the 
boundaries of the study area which cannot be fully repre- 
sented on the maps because the study area has been clipped. 

Other tools and techniques, such as LUCC budget or 
Quantity and Allocation disagreement, can also help char- 
acterize real changes in the study area and detect areas where 
no changes have taken place, despite being marked as 
change areas on the maps. In this way, these techniques can 
provide useful, complementary information on this question. 

Fig. 18 Result from Exercise 4. Map showing areas of change between the two maps compared, displayed over the map for the oldest year


2 Multiple-Resolution Cross-Tabulation 
Description 


Multiple-Resolution Cross-Tabulation is based on the same 
technique as basic Cross-Tabulation (see previous section). 
It crosses two raster datasets at a minimum of two different 
spatial resolutions: the original resolution and a coarser one. 
However, users can compare the dataset at as many different 
resolutions as they deem fit. These must always be coarser 
than the original spatial resolution. 

The concept of spatial resolution refers to the level of 
spatial detail available in the spatial data. It applies to data in 
raster format, where the spatial resolution is defined by the 
pixel size. This means that, unlike basic Cross-Tabulation, 
this analysis can only be performed with raster data. 


Utility 


Exercises 


1. To validate a map against reference data/map 

2. To validate soft maps produced by the model against a reference 
map 

3. To validate a simulation against a reference map 


This technique aims to control the multiscale uncertainty 
of a validation exercise, which is not considered in basic 
Cross-Tabulation analyses. It can also be used to evaluate 
the uncertainty of a LUC classification exercise, a LUC map 
or a LUCC modelling exercise against reference data. 


Maps that show a lot of disagreement at detailed scales 
can refer to the same information at coarser scales. This 
technique can therefore be used to discover at which spatial 
resolution a map is considered least uncertain according to 
the information provided by a reference map. 

This analysis can be used as a complement to fuzzy logic 
tools (Fritz and See 2005), which evaluate the agreement 
between maps by considering spatial near-hits. A near-hit 
occurs when two pixels that share the same value are not in 
the same spatial position, but close to each other. 

Multiple-Resolution Cross-Tabulation can only be carried
out with raster data. However, we can make the comparison
with either hard-classified raster maps or soft-classified
ones, such as suitability, transition potential or probability
maps. In the latter case, we must always reclassify the
soft-classified raster maps into a set of categories. It is not
possible to cross-tabulate rasters with a continuous range of
values.

As in the case of basic Cross-Tabulation, if we want to 
explore the full potential of the results of these analyses, we 
can use other complementary metrics such as Land Use Cover 
Change budget (LUCC budget, see Sect. 2), Quantity and 
Allocation disagreement or the Figure of Merit (see Sects. 3 
and 4 in Chapter “Pontius Jr. Methods Based on a Cross- 
Tabulation Matrix to Validate Land Use Cover Maps"). 

In addition to the basic Multiple-Resolution Cross- 
Tabulation presented in this section, some more sophisticated 
variants have been proposed by other authors. These include: 


* Costanza (1989), who proposed a method to determine
the goodness of fit between model output and spatial
and/or time series data, based on the idea that
measurements at one resolution are not sufficient to
describe more complex patterns. In his method, an
expanding window is used to gradually degrade the
resolution of the data, distinguishing, within the lack of
fit, "registration", "resolution" and residual components.

* Kok et al. (2001), who proposed a multiscale land use
change modelling procedure, applied at five spatial
resolutions, and demonstrated that results improve
strongly as the spatial resolution becomes coarser.

* Pontius and Cheuk (2006), who proposed a method for
computing a Cross-Tabulation matrix at multiple scales,
focusing on soft-classified pixels. This Multiple-Resolution
method resolves difficulties that arise with traditional
Cross-Tabulation approaches and fuzzy methods by
proposing a Composite operator.


QGIS Exercises 


Available tools 


* Processing Toolbox > GRASS > Raster > r.resample
* Processing Toolbox > GDAL > Raster projections > Warp (reproject)
* Processing Toolbox > SAGA > Raster tools > Resampling
* Layer > Save As...


QGIS does not include a tool to cross-tabulate maps at 
multiple resolutions. To carry out this analysis, it is therefore 
necessary to combine raster resampling tools with the basic 
Cross-Tabulation tools. For detailed information on the tools
available in QGIS for performing Cross-Tabulation, please
refer to Sect. 1.

Various tools can be used to resample raster maps in QGIS.
The GRASS module provides a tool (r.resample) for
resampling the raster according to the Nearest Neighbour
method. The GDAL module provides a tool to reproject
rasters (Warp (reproject)) that also enables resampling
through different methods, including Nearest Neighbour.
For its part, the SAGA toolbox provides a tool for
resampling rasters with similar options. In addition, the
QGIS interface allows the user to resample maps by making
a copy of a displayed map via the option “Save raster layer
as...” (Layer > Save As...).

For categorical maps such as Land Use Cover maps, two 
resampling strategies are usually applied: Nearest Neighbour 
and Majority Rule. We decided to apply Nearest Neighbour 
because this is the method that best preserves the landscape 
composition and configuration or in other words, the pro- 
portions of the different categories and their patterns. 
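
The difference between the two strategies can be illustrated with a small sketch. It is not the algorithm implemented by any of the QGIS tools listed above: the grid is a hypothetical toy raster, the function names are illustrative, and ties in the majority rule are resolved in favour of the first category encountered, which is an arbitrary assumption.

```python
from collections import Counter

def nearest_neighbour(grid, factor):
    """Keep the pixel nearest to the centre of each factor x factor block."""
    offset = factor // 2
    return [
        [grid[r][c] for c in range(offset, len(grid[0]), factor)]
        for r in range(offset, len(grid), factor)
    ]

def majority_rule(grid, factor):
    """Assign to each coarse pixel the most frequent category of its block."""
    out = []
    for r0 in range(0, len(grid), factor):
        row = []
        for c0 in range(0, len(grid[0]), factor):
            block = [
                grid[r][c]
                for r in range(r0, min(r0 + factor, len(grid)))
                for c in range(c0, min(c0 + factor, len(grid[0])))
            ]
            # Ties go to the first category encountered (assumption)
            row.append(Counter(block).most_common(1)[0][0])
        out.append(row)
    return out

# Hypothetical toy map: category 2 only appears as isolated pixels
grid = [
    [1, 1, 1, 1],
    [1, 2, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 2],
]

print(nearest_neighbour(grid, 2))
print(majority_rule(grid, 2))
```

On a grid this small the resulting proportions are noisy, so the sketch should not be read as a measurement. It only shows that the majority rule can remove a category that never dominates a block, whereas Nearest Neighbour can still retain it.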

The four resampling tools available in QGIS are all 
equally valid. In this case we decided to use the tool that 
becomes available when making a copy of an existing raster 
(Save as...) because of its simplicity and efficiency. Never- 
theless, users must be aware that the resampled rasters will 
vary slightly depending on the method chosen, and are 
therefore not fully comparable. Once a method or tool has 
been selected, all the resampling procedures must be per- 
formed using this same method or tool. 


Exercise 1. To validate a map against reference 
data/map 


Aim 

To validate the CORINE 2011 Land Use map, taking the 
SIOSE 2011 Land Use map as the reference and determining 
the resolution at which the maps show most agreement. 


Materials 


SIOSE Land Use Vector Map Asturias Central Area 2011 
CORINE Land Use Vector Map Asturias Central Area 2011 


Requisites 

The two maps must have the same extent, projection and 
classification legend. If the maps have different classification 
legends, the user must reclassify the maps in such a way as 
to unify the two legends. 

Execution 

Step 1 

Given that to carry out Cross-Tabulation at multiple
resolutions we need to have maps in raster format, the first
thing we have to do is rasterize our vector maps. If you would
like to perform this analysis by resampling original raster
maps, please refer to Exercise 2, Step 3.

Fig. 19 Exercise 1. Step 1. Rasterize (Vector to Raster)

We are going to convert our original vector file to raster at 
four different spatial resolutions: 25, 50, 75 and 100 m. Our 
analysis will be based on the same four spatial resolutions. 

To rasterize vector data, we use the Rasterize (Vector to
Raster) tool. Once inside this tool, we begin by indicating the
vector layer we want to rasterize (SIOSE 2011 map). Then,
we go to “Field to use for a burn-in value [optional]”, where
we indicate the field in the attribute table of the vector layer
that will give the raster its pixel values (Metro) (Fig. 19).

We must also set the spatial resolution for the raster we
want to create. To do this, we must first define the units for
the spatial resolution in the “Output raster size units” option
(Georeferenced units). Then, we choose the spatial
resolution or pixel size through the “Width/Horizontal
resolution” (25) and “Height/Vertical resolution” (25)
options. We must also specify the extent of the raster that
will be created in the option “Output extent (xmin, xmax,
ymin, ymax)”. We are going to use the extent of the layer we
are rasterizing (SIOSE 2011) through the submenu on the
right (Use layer extent...).

The final stage is to assign a value to the background, i.e.
the pixels that are not covered by any polygon in the vector
file. Given that the vector already has values from 0 to 11,
we will define the background with code 12. We do this via
the option “Pre-initialize the output image with value
[optional]”, available under the “Advanced parameters”
options (Fig. 20).

Our background value (12) will also be the nodata value
of our raster. We can assign a nodata value for the raster we
are going to create using the option “Assign a specified
nodata value to output bands [optional]” (Fig. 19).

Fig. 20 Exercise 1. Step 1. Advanced parameters of the Rasterize (Vector to Raster) tool


Step 2 


Once we have finished the first rasterization, we must repeat 
the same procedure for the other three spatial resolutions that 
we need for the SIOSE dataset. Then, we must repeat the 
whole workflow for the CORINE map. Once all these tasks 
have been completed, we will have 8 different maps (4 
SIOSE and 4 CORINE) at 4 different spatial resolutions (25, 
50, 75 and 100 m). 


Step 3 


Once all the maps have been created, we can start the 
Cross-Tabulation. To do this, open the “Semi-Automatic 
Classification Plugin", click on the “Postprocessing” tab and 
select Cross Classification. Then, select the required 
parameters: raster to assess (CORINE map 25 m) and ref- 
erence raster (SIOSE map 25 m) (Fig. 21). 


Step 4 
After the first execution, repeat this process with the other
pairs of maps (one for CORINE and one for SIOSE in each
pair) at the remaining spatial resolutions (50, 75 and 100 m).


Results and Comments 


Once we have executed the function four times, QGIS will 
create an output map for each execution with the combined 
classes and an error/Cross-Tabulation matrix. These will be 
stored in the folder we selected earlier when executing the 
tool. Matrixes are also displayed in the “output” window. 
For a detailed description of each of these results, please
refer to Sect. 1.

If we compare the results of each of the error matrixes, we
can see that there are few differences between them. Error
matrixes show the area in square metres covered by each
possible combination between classes. The combination that
covers most area is always the agreement between
agricultural areas: pixels that are 0 (agricultural areas) in
both the validated (CORINE) and the reference (SIOSE)
maps. At a spatial resolution of 25 m, these areas occupy
585,267,500 m²; at 50 m, 585,225,000 m²; at 75 m,
585,815,625 m²; and at 100 m, 584,660,000 m². The
differences are therefore very small.
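
To put these figures in perspective, the short sketch below converts the four areas quoted above into pixel counts and expresses their spread as a percentage of the mean. The areas are taken from the text; the variable names are illustrative.

```python
# Agreement area (m²) between agricultural areas, by pixel size (m), as quoted in the text
agreement_area_m2 = {
    25: 585_267_500,
    50: 585_225_000,
    75: 585_815_625,
    100: 584_660_000,
}

# Each area is a whole number of pixels of the corresponding size
pixel_counts = {res: area // res**2 for res, area in agreement_area_m2.items()}

areas = list(agreement_area_m2.values())
spread_m2 = max(areas) - min(areas)
mean_m2 = sum(areas) / len(areas)
spread_pct = 100 * spread_m2 / mean_m2

print(pixel_counts)
print(f"spread: {spread_m2} m² ({spread_pct:.2f} % of the mean area)")
```

The largest and smallest values differ by 1,155,625 m², about 0.2 % of the mean agreement area, which supports the reading that the choice of resolution barely affects this combination.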

A similar pattern can be observed if we look at the rest of
the combinations. This means that at all the spatial
resolutions there are very similar levels of agreement and
disagreement between the classes on the two maps (CORINE
and SIOSE). We can therefore conclude that the spatial
resolution selected to make the analysis has no substantial
effect on the results.

Fig. 21 Exercise 1. Step 2. Semi-Automatic Classification plugin

That means that the areas classified differently on the two 
maps are not due to small details drawn on one map that do 
not appear on the other. Disagreement is not the result of 
isolated pixels on one map that are not classified in the same 
category on the other. If this were true, the agreement 
between the two maps should be higher at coarser resolu- 
tions because they are more generalized, so ruling out minor 
details. 

In conclusion, it would seem that the differences between 
the two maps are structural. In other words, they are not 
caused by the spatial resolution or level of detail of the maps, 
and instead result from the fact that each map represents a 
different reality on the ground. If we generalize both maps 
and rule out all small details, both maps show a similar level 
of agreement. Notwithstanding this, we must always
remember that most of the areas in both maps agree, as
confirmed in Sect. 1.

When compared with SIOSE, CORINE can be consid- 
ered a valid map because the agreement between the two is 
very high. The differences between them are the same 
regardless of the spatial resolution employed to make the 
analysis, at least within the resolution range we used (from 
25 to 100 m). Thus, although the differences between SIOSE 
and CORINE are the result of their different scale and 
Minimum Mapping Unit, they cannot be eliminated simply 
by generalizing the maps using coarser spatial resolutions. In
fact, their agreements and disagreements remain the same,
which suggests that the different scale of production
introduces important structural differences in the way the
two maps represent land uses and land covers on the ground.


Exercise 2. To validate soft maps produced by the 
model against a reference map 


Aim 


To evaluate to what extent the urban fabric suitability map of 
our model agrees with the urban fabric areas of the reference 
map for the year of the simulation at multiple spatial reso- 
lutions, determining the resolution at which there is most 
agreement. 


Materials 


CORINE Land Use Map Asturias Central Area 2011 
Urban fabric suitability map—CORINE model 


Requisites 


The two maps must have the same extent, spatial resolution
and projection. The soft map must be a categorical map. The
Land Use map must only contain information about the
category being assessed. For a proper validation, the
reference map must refer to the same date as the simulation.

Fig. 22 Exercise 2. Step 3. Save Raster Layer as... tool


Execution 
Step 1 


We begin by converting our soft map into a categorical one to 
comply with the requirements of the Cross-Tabulation tool. 
This is done using the Reclassify by table function (Pro- 
cessing toolbox > Raster analysis > Reclassify by table). 

There are no standard criteria for the reclassification of
soft maps and users can apply whatever thresholds they
think best. In this case, we will use the same thresholds we
used in Exercise 2 of Sect. 1. We will therefore reclassify
the map into four new categories: 1 (suitability 0-0.25), 2
(0.25-0.50), 3 (0.50-0.75) and 4 (0.75-1).
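
The reclassification rule can be written down explicitly as follows. This is a sketch of the rule only, not of the Reclassify by table tool. The text does not say which class the boundary values (0.25, 0.50, 0.75) belong to, so assigning each boundary to the upper class is an assumption, and the function name is illustrative.

```python
def reclassify_suitability(value):
    """Map a suitability score in [0, 1] to one of the four classes used in this exercise."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("suitability must lie between 0 and 1")
    if value < 0.25:
        return 1
    if value < 0.50:  # boundary values go to the upper class (assumption)
        return 2
    if value < 0.75:
        return 3
    return 4

# Hypothetical row of suitability scores
scores = [0.10, 0.25, 0.49, 0.74, 1.00]
print([reclassify_suitability(s) for s in scores])
```

Whichever boundary convention is chosen, it must be the same one configured in the reclassification tool, otherwise pixels lying exactly on a threshold will fall into different classes.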


Step 2 


As stated in the requisites, we will cross-tabulate the
reclassified soft map with a map that only shows the Land
Use Cover category of interest, i.e. urban fabric. To this end,
we must extract the urban fabric areas from the LUC map
(CORINE) using the same function as in Step 1 (Reclassify
by table). In the reclassification, we will assign a value of 1 to
urban fabric (code 2 in the original map) and a value of 0 to
the other categories (codes 0, 1, 3, 4, 5, 6, 7, 8, 9, 10, 11 and
12). For a detailed explanation of how to carry out these first
two steps, readers are referred to Exercise 2 of Sect. 1.
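
The extraction of a single category amounts to a binary reclassification, sketched below on a hypothetical toy grid. The code for urban fabric (2) comes from the text; the function name and the example grid are illustrative.

```python
URBAN_FABRIC_CODE = 2  # code of urban fabric in the original map, as stated in the text

def extract_category(grid, code=URBAN_FABRIC_CODE):
    """Return a binary grid: 1 where the category of interest is present, 0 elsewhere."""
    return [[1 if value == code else 0 for value in row] for row in grid]

# Hypothetical fragment of a LUC raster with codes between 0 and 12
luc = [
    [0, 2, 2],
    [5, 2, 12],
    [0, 0, 11],
]
print(extract_category(luc))
```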


Step 3 


Once we have the two maps, we can then resample them at
different spatial resolutions to carry out the
Multiple-Resolution Cross-Tabulation. In our case, as the
original pixel size is 50 m, we will resample our maps at 75,
100, 125 and 150 m using the Save As... tool. In this tool, we
need to indicate the name of the map we are going to
resample (the reclassified suitability map of urban fabric) and
the spatial resolution at which we will resample the maps
(Fig. 22), in our case, 75 m.

Fig. 23 Exercise 2. Step 5. Semi-Automatic Classification plugin


Step 4 


After resampling the map, we must repeat the same
procedure for the other resolutions (100, 125 and 150 m).
Then, we do the same for the urban fabric areas map. By the
end we should have 8 resampled maps (4 suitability maps and
4 urban fabric maps) at 4 different spatial resolutions (75,
100, 125 and 150 m).


Step 5 


Once we have obtained all the maps we need, we can then 
carry out the Cross-Tabulation exercise using the Cross 
classification tool from the “Semi-Automatic Classification 
Plugin". Once inside the tool, we must indicate the two 
rasters that we want to cross-tabulate: the soft map (Select 
the classification) and the land use map for the category of 
interest (Select the reference vector or raster) (Fig. 23). 


Step 6 


After we do this for the maps at the original resolution 
(50 m), we repeat the process at the other 4 spatial resolu- 
tions (75, 100, 125 and 150 m). 


Results and Comments 


After executing the function for each pair of maps at each
spatial resolution, the tool produces (for each spatial
resolution) an output map with the combination and two
matrixes detailing how the values of both maps
cross-tabulate. These are stored in the folder we selected and
are also displayed on the screen (Output tab). For a detailed
description of each of these results, please refer to Sect. 1.

The “Cross Matrix” is the most interesting of all these
results in that it provides us with all the information we need 
for our analysis. It details how much of the area for each 
category in the reclassified suitability map falls inside areas 
that are urban fabric in our reference maps (Tables 4, 5, 6, 7 
and 8). 

Table 4 Result from Exercise 2. Table showing the correspondence
between the urban fabric category in CORINE and the different groups
of suitability values for urban fabric in the map at 50 m of spatial
resolution. Areas are in m²

50 m             0 (Not urban fabric)    1 (Urban fabric)
1 (0-0.25)       2,312,499               4,999
3 (0.50-0.75)    578,867,274             4,927,498
4 (0.75-1)       962,127,124             65,342,474


Table 5 Result from Exercise 2. Table showing the correspondence
between the urban fabric category in CORINE and the different groups
of suitability values for urban fabric in the map at 75 m of spatial
resolution. Areas are in m²

75 m             0 (Not urban fabric)    1 (Urban fabric)
1 (0-0.25)       2,271,136               11,245
3 (0.50-0.75)    578,752,547             4,919,739
4 (0.75-1)       963,211,926             65,351,009


Table 6 Result from Exercise 2. Table showing the correspondence
between the urban fabric category in CORINE and the different groups
of suitability values for urban fabric in the map at 100 m of spatial
resolution. Areas are in m²

100 m            0 (Not urban fabric)    1 (Urban fabric)
1 (0-0.25)       1,576,662               738,436
3 (0.50-0.75)    578,405,616             5,208,973
4 (0.75-1)       961,414,853             64,373,732


Table 7 Result from Exercise 2. Table showing the correspondence
between the urban fabric category in CORINE and the different groups
of suitability values for urban fabric in the map at 125 m of spatial
resolution. Areas are in m²

125 m            0 (Not urban fabric)    1 (Urban fabric)
1 (0-0.25)       2,410,302               15,651
3 (0.50-0.75)    579,223,768             5,008,419
4 (0.75-1)       961,507,072             65,125,110


Table 8 Result from Exercise 2. Table showing the correspondence
between the urban fabric category in CORINE and the different groups
of suitability values for urban fabric in the map at 150 m of spatial
resolution. Areas are in m²

150 m            0 (Not urban fabric)    1 (Urban fabric)
1 (0-0.25)       2,296,991               0
3 (0.50-0.75)    580,283,055             4,841,697
4 (0.75-1)       960,525,192             65,216,537


For the analysis at a spatial resolution of 50 m, there are
4999 m² of low suitability (suitability below 0.25) that
cross-tabulate with areas that are urban fabric in the
reference LUC map. If we consider that each pixel represents
an area of 2500 m² (50 m × 50 m), this means that only 2
pixels of urban fabric cross-tabulate with areas of low
suitability on the suitability map. 1971 pixels with medium to
high suitability (0.50-0.75) cross-tabulate with areas that are
urban fabric. Finally, most of the urban fabric pixels
cross-tabulate with areas with the highest suitability
(0.75-1): this combination is represented by 26,137 pixels.
These data show that there is a positive correlation between
suitability and the presence of urban fabric. We can therefore
conclude that suitability is a good driver for our model.

Varying the spatial resolution of the analysis did not lead 
to any major differences in the correlation between the 
suitability map and the urban fabric areas in the reference 
maps. At the five spatial resolutions assessed, most of the 
pixels fell within the highest suitability category (0.75-1). 

The dissimilarities between the analyses at different
resolutions were very small. At 75 m, just two pixels fell within
the areas of lowest suitability (11,245 m²). At 100 m, there
were a lot more: 74 pixels (738,436 m²). At 125 m there was
just 1 pixel (15,651 m²), and at 150 m, no pixels at all
(0 m²). Similar behaviour can be observed for the other two
categories of suitability at all five resolutions.

This indicates that the suitability map for urban fabric in
our modelling exercise is correct. It positively correlates 
with those areas that are urban fabric in our reference map, 
so helping us to identify the areas in which new urban fabric 
is most likely to appear. However, no conclusions can be 
drawn regarding the best spatial resolution at which to carry 
out the modelling exercise. As the explanatory power of the 
suitability maps is very similar at all the spatial resolutions 
assessed, the decision as to which spatial resolution would 
be best for our modelling exercise should be based on other 
factors, such as how realistic the pattern looks or what the 
minimum level of detail might be for the model to be useful 
for stakeholders and users. 
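
The pixel counts quoted in this discussion follow from dividing each area in Tables 4 to 8 by the area of one pixel at the corresponding resolution. The sketch below reproduces that arithmetic. The areas are those reported in the tables; rounding to the nearest whole pixel is an assumption made because the reported areas are not exact multiples of the pixel area.

```python
def area_to_pixels(area_m2, resolution_m):
    """Convert an area in m² to a number of square pixels of the given size."""
    return round(area_m2 / resolution_m**2)

# Urban fabric area (m²) falling in the lowest suitability class, by resolution (m)
low_suitability_urban_m2 = {50: 4_999, 75: 11_245, 100: 738_436, 125: 15_651, 150: 0}

low_suitability_pixels = {
    res: area_to_pixels(area, res) for res, area in low_suitability_urban_m2.items()
}
print(low_suitability_pixels)

# The two other classes discussed for the 50 m analysis
print(area_to_pixels(4_927_498, 50))   # class 3 (0.50-0.75)
print(area_to_pixels(65_342_474, 50))  # class 4 (0.75-1)
```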

This analysis could be complemented with more sophis- 
ticated tools like the ROC curve and the Difference in 
Potential (see Sects. 2 and 3 in Chapter “Validation of Soft 
Maps Produced by a Land Use Cover Change Model”). 
These tools also provide information about how well a 
model soft map simulates a category of interest, such as 
urban fabric. 


Exercise 3. To validate a simulation against a ref- 
erence map 


Aim 


To validate a simulation for the year 2011 against a reference 
map for the same year at multiple spatial resolutions, 
determining the resolution at which both maps show the best 
agreement. 


Materials 


Simulation CORINE Asturias Central Area 2011 
CORINE Land Use Map Asturias Central Area 2011 


Requisites 

The two maps must have the same extent, spatial resolution, 
projection and legend. For proper validation, the reference
map must refer to the same date as the simulated
landscape.

Execution 

Step 1 

For Multiple-Resolution Cross-Tabulation, we first need to
resample the original rasters (50 m) at other spatial
resolutions. In this case, we will resample our simulation at
100, 150 and 200 m, according to the procedure for the Save
As... tool set out in the previous exercise (Exercise 2,
Execution, Step 3). Once inside the tool, we fill in the
required parameters: name of the raster to be resampled
(Simulation CORINE) and spatial resolution (100 m).


Step 2 


Once we have resampled the first map, we then repeat the 
procedure for the other spatial resolutions (150 and 200 m) 
and for the reference map. By the end, we should have 8 
maps (4 simulations and 4 reference maps) at 4 spatial res- 
olutions (50, 100, 150 and 200 m). 


Step 3 


With all these resampled maps, we can then carry out the 
Cross-Tabulation exercise at multiple resolutions. To do this, 
open the “Semi-Automatic Classification Plugin", click on 
the “Postprocessing” tab and select Accuracy. Fill in the 
required parameters: raster to assess (Simulation CORINE 
11 map at 50 m) and reference raster (CORINE 11 map at 
50 m) (Fig. 24). 


Step 4 


Repeat the same procedure for the other pairs of maps at
100, 150 and 200 m.


Results and Comments


After this function has been executed for each spatial reso- 
lution, QGIS will create an output map, a couple of matrixes 
and some statistical measures. All the tables and statistics 
can be consulted in the “output window” and all the results 
will be saved in the folder we selected earlier. For a detailed
description of each of these results, please refer to Sect. 1.

The analysis of the matrixes at the different spatial
resolutions shows no important differences between
resolutions, and very similar results in all cases. In general,
there is a high level of agreement between the simulation and
the reference map, as studied above in Sect. 1 when
conducting the analysis at the original resolution of the
modelling exercise.

If we take Overall Accuracy as a summary metric 
describing the similarity between the two maps, we can see 
that similarity is very high in all cases (Table 9). Only the 
exercise at 100 m shows a lower agreement rate. This may 
be due to multiple causes, but it does indicate that coarsening 
the spatial resolution of the simulation does not ensure 
higher levels of agreement between the simulated landscape 
and the reference landscape. 
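
As a reminder of how this summary metric is obtained, Overall Accuracy is the share of the cross-tabulated total that lies on the diagonal of the matrix, i.e. where both maps show the same category. The sketch below computes it for a hypothetical 3 x 3 matrix. The values are invented for illustration and are not the results of this exercise.

```python
def overall_accuracy(matrix):
    """Share of the total (pixels or area) lying on the diagonal of a square matrix."""
    total = sum(sum(row) for row in matrix)
    agreement = sum(matrix[i][i] for i in range(len(matrix)))
    return agreement / total

# Hypothetical cross-tabulation matrix (rows: simulation, columns: reference)
example_matrix = [
    [50, 2, 0],
    [1, 30, 1],
    [0, 1, 15],
]
accuracy = overall_accuracy(example_matrix)
print(f"Overall accuracy: {100 * accuracy:.1f} %")
```

Applying the same function to the matrix obtained at each resolution yields one value per resolution, which is the kind of comparison summarized in Table 9.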

We must also bear in mind the limitations for this
exercise mentioned in Sect. 1. Validating a simulation by
cross-tabulating the simulation with a reference map
may be misleading. Most of the areas in both maps agree
because most of the areas in the simulated landscape remain
the same during the modelling period.

Fig. 24 Exercise 3. Step 3. Semi-Automatic Classification plugin


Table 9 Results from Exercise 3. Overall accuracies of the simulation,
when assessed against a reference map, at four spatial resolutions: 50,
100, 150 and 200 m

                        50 m    100 m    150 m    200 m
Overall accuracy (%)    99.3    93.6     99.3     99.3

The best way to validate the changes modelled in our
exercise is to focus exclusively on the simulated changes and
on a map of reference showing the changes on the ground. In
this case, the Multiple-Resolution exercise could provide
very interesting insights, as agreement between simulated
and reference changes may be higher at coarser spatial
resolutions.

Jean-Francois Mas, David García-Álvarez, Martin Paegelow,
Roberto Domínguez-Vera, and Miguel Ángel Castillo-Santiago


Abstract 


The overlaying of two map layers is a standard GIS 
procedure. As we saw in the previous chapter, it enables us 
to compute the intersection between two feature classes 
and cross-tabulate either the area or the pixel count of the 
intersecting features depending on whether raster or vector 
data are being used. Cross-tabulation can be used to 
evaluate different topics depending on the nature of the 
input data. In this chapter, cross-tabulation is used to 
assess land cover changes, the spatial agreement between 
maps and map accuracy. In Sect. 1, land use/cover
changes (LUCC) are quantified by comparing two LUC 
maps, computing different indices of change and creating 
a change matrix. In Sect. 2, we used various metrics to 
evaluate the spatial agreement between two maps. This 
procedure was applied to compare a LUC map with a 
reference map, a simulated LUC map with a reference map 
and a simulated LUCC map with a reference map of 
changes. Section 3 introduces the Kappa indices, which 
allow us to assess the agreement between two datasets, 
given the agreement expected by random coincidence. We 
used the indices to compare observed or simulated maps 
with a reference map. In Sect. 4 we evaluate the agreement
between maps at a global level (the entire map) or by
focusing on a specific feature such as a smaller area or a
particular category (stratum level). Finally, in Sect. 5, the
cross-tabulation between a map and reference sample data
is used to assess the thematic accuracy of the map by
calculating various accuracy indices. We present
examples of analyses based on cross-tabulation for four
different cases: to validate a series of maps with two or
more time points, to validate a map against a reference
map, to validate a simulation against a reference map and
to validate simulated changes against a reference map of
changes. In the example exercises, we use CORINE and
SIOSE maps from the Asturias Central Area and Ariège
Valley datasets and maps of the Marqués de Comillas
region of south-eastern Mexico (MarquesLUC dataset).
The cross-tabulation techniques proposed by Robert 
Gilmore Pontius Jr. are applied in Chapter “Pontius Jr. 
Methods Based on a Cross-Tabulation Matrix to Validate 
Land Use Cover Maps". 


Keywords 


Cross-tabulation * Changes * Spatial agreement * 
Accuracy 


1 Change Statistics 
Description 


Land use/cover change (LUCC) can be quantified by com- 
paring two maps or two classified images that represent land 
cover at two different dates. 

Absolute change (AC) is the difference in the area cov- 
ered by a category (category area) between two dates and is 
usually expressed in hectares or square kilometres. 


$$AC = A_2 - A_1$$

where $A_1$ and $A_2$ are the category areas in question at dates 1
and 2, respectively.

AC can be divided by the number of years between the 
two dates to obtain the average annual change area over the 
study period. 

Relative change (RC) is obtained by normalizing the 
absolute change value by the category area at date 1. 


$$RC = (A_2 - A_1) / A_1$$


This formula expresses the proportion of the category 
area that changed over the study period. 

Other indices of LUCC include rates of change. The most 
popular rate of change is the annual rate of deforestation 
proposed by the FAO (1995). This indicator is based on the 
compound interest law. It expresses the proportion of the 
category area that changes in one year. 


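$$t = \left(\frac{A_2}{A_1}\right)^{1/(t_2 - t_1)} - 1$$

where $t$ is the annual rate of change and $t_1$ and $t_2$ are
dates 1 and 2.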


An alternative equation, also based on the compound 
interest law, was proposed by Puyravaud (2003). 


$$r = \frac{1}{t_2 - t_1} \ln\frac{A_2}{A_1}$$


Both formulae give similar results except when LUCC is 
very high, in which case r is significantly higher than t 
(Puyravaud, 2003). 
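
The four indices defined above can be computed directly from the two category areas and the two dates. The following sketch is not the R script used in the exercise: the function name and the example areas are hypothetical, and only the formulas come from the text.

```python
import math

def change_indices(area_1, area_2, date_1, date_2):
    """Return AC, RC and the two annual rates of change (t: FAO, r: Puyravaud)."""
    years = date_2 - date_1
    absolute_change = area_2 - area_1
    return {
        "AC": absolute_change,
        "AC_per_year": absolute_change / years,
        "RC": absolute_change / area_1,
        "t": (area_2 / area_1) ** (1 / years) - 1,
        "r": math.log(area_2 / area_1) / years,
    }

# Hypothetical example: a category shrinking from 1000 ha to 800 ha between 2000 and 2018
indices = change_indices(1000, 800, 2000, 2018)
for name, value in indices.items():
    print(f"{name}: {value:.5f}")
```

For a shrinking category both rates are negative, and r is the larger of the two in magnitude because ln(x) is never greater than x - 1. The gap widens as the change becomes stronger.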

All the change indices presented above indicate net 
change, which results from the balance after gross losses 
have been subtracted from gross gains. For instance, a given 
forest category could show an absolute change of —2 ha, 
which could be erroneously interpreted as very little change, 
but in fact is the result of two opposing processes: the 
deforestation of 202 ha compensated by the reforestation of 
200 ha. A more detailed analysis of change dynamics can be 
obtained by cross-tabulating the two maps at two different 
dates and drawing up a change matrix. The change matrix is 
a cross-tabulated table indicating the area covered by each 
change (or permanence) between a category at date 1 and 
another category at date 2. Many change indices can be 
obtained from this matrix (see, for example, Sect. 2). 
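
The forest example above can be reproduced with a small sketch that builds a change matrix from two maps and separates gross gains and losses from net change. The maps are hypothetical and flattened to lists of category labels, each pixel is assumed to cover 1 ha, and the function names are illustrative.

```python
from collections import Counter

def change_matrix(map_1, map_2, pixel_area_ha=1.0):
    """Area of every (category at date 1, category at date 2) combination."""
    counts = Counter(zip(map_1, map_2))
    return {pair: n * pixel_area_ha for pair, n in counts.items()}

def gross_and_net(matrix, category):
    """Gross gain, gross loss and net change (gain minus loss) of one category."""
    gain = sum(a for (c1, c2), a in matrix.items() if c2 == category and c1 != category)
    loss = sum(a for (c1, c2), a in matrix.items() if c1 == category and c2 != category)
    return gain, loss, gain - loss

# Hypothetical maps: 300 ha of forest and 300 ha of crops at date 1
map_1 = ["forest"] * 300 + ["crop"] * 300
map_2 = ["forest"] * 98 + ["crop"] * 202 + ["forest"] * 200 + ["crop"] * 100

matrix = change_matrix(map_1, map_2)
print(matrix)
print(gross_and_net(matrix, "forest"))
```

The net change of forest is only -2 ha, yet the matrix reveals 202 ha of deforestation and 200 ha of reforestation, which is the information that net indices alone conceal.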


Utility 
Exercises 


1. To validate a series of maps with two or more time points 


Indices of change are widely used to assess LUCC.
Normalized indices, such as rates of change, enable us to
compare the rate of change between regions of different sizes.


QGIS Exercise


Available tools 


* Processing R provider plugin > Change Statistics.rsx R script


The indices of change proposed in this document are based
on the area statistics for the two maps. These could be
efficiently computed using a spreadsheet program. However,
we suggest using a simple R script run through the QGIS
Processing R provider plugin. The script generates a table
containing the absolute change (AC) in hectares, the relative
change (RC) as a percentage, the rates of change based on
FAO (1995) and Puyravaud (2003) and the change matrix.


Exercise 1. To validate a series of maps with two or 
more time points 


Aim 


To assess LUCC in the Ariège study area using the CORINE
Land Cover maps dated 2000 and 2018.


Materials 


CORINE Land Cover Map Val d'Ariège 2000
CORINE Land Cover Map Val d'Ariège 2018


Requisites 


All maps must be in raster format and have the same reso- 
lution, extent and projection. 


Execution 


If necessary, install the Processing R provider plugin and
download the R script Change Statistics.rsx into the R
scripts folder (processing/rscripts). For more information,
see Chapter “About This Book”.


Step 1


Execute the script and fill in the required parameters
(names and dates of the two maps and the output tables), as
shown in Fig. 1.


Fig. 1 Exercise 1. Step 1. Change Statistics R script (dialog with the parameters Map date1: CLC 2000, Map date2: CLC 2018, date1: 2000, date2: 2018, and the CSV outputs Stat and ChangeMatrix)


The script generates two tables in CSV format: a table
showing the change indices (Table 1) and the change matrix
(Table 2).


Table 1 Results from Exercise 1 displayed in the “output” window of the Change Statistics R script. Change indices

Category                                0      1       2       3       4     5      6
Area date 1 (ha)                   74,437   3350  54,558  39,491  12,729   931     76
Area date 2 (ha)                   74,437   5190  52,571  40,344  11,973   943    115
Absolute change (ha)                    0   1840   -1987     853    -756    12     39
Relative change (%)                     0  54.93   -3.64    2.16   -5.94  1.29  51.32
Annual rate of change t (FAO) (%)       0   2.46   -0.21    0.12   -0.34  0.07   2.33
Annual rate of change r (%)             0   2.43   -0.21    0.12   -0.34  0.07   2.30


Table 2 Results from Exercise 1 displayed in the “output” window of the Change Statistics R script. Change matrix (rows: category in 2000; columns: category in 2018; areas in ha)

           1          2          3          4       5      6
1    3301.70       3.41       0.02       8.17    0.00  36.59
2    1853.16  52,059.37     235.25     408.86    0.00   1.57
3      22.74     108.68  39,232.14     127.07    0.00   0.00
4      12.37     399.27     876.72  11,418.38   21.89   0.00
5       0.00       0.00       0.02      10.04  920.81   0.00
6       0.00       0.00       0.00       0.00    0.00  76.40


Results and Comments

The two land covers with the most significant absolute
change are Categories 1 (built-up) and 2 (agriculture).
During the period 2000-2018, the built-up area increased
by 1840 ha, and agriculture lost 1987 ha. The built-up area
increased by over 50%. The rates of change resulting from
the two equations are very similar. The two categories with
the largest rates are built-up (Category 1) and water
(Category 6) areas. Over the period 2000-2018, the area covered
by these categories increased by 2.3 to 2.5% a year.
Categories 2 and 4 (agriculture and scrubs) present a negative net
change rate, indicating that their areas have been shrinking.
The change matrix gives us more information about the
processes of change. One surprising change is the transition
from 1 (built-up) to 6 (water). On closer observation, it was
found that pits had been filled with water to create reservoirs.


2 Areal and Spatial Agreement Metrics

Description


Different authors have proposed a series of metrics that
evaluate the areal and spatial agreement between two land
use/cover maps or between any of their categories. These
metrics are obtained from the cross-tabulation matrix and
summarize in a single value the agreement between two
maps.

The metrics are based either on the comparison of the
proportion of total area occupied by a particular category on
two maps or on the spatial coincidence of the pixels
allocated to any given category on two maps. This review
includes some of the most recently developed metrics.

Yang et al. (2017) proposed the overall spatial agreement
(Ao) and the individual spatial agreement (Ai) metrics. They
are formulated as follows:

$$A_o = \frac{\sum_{i=1}^{N} XY_i}{M} \times 100$$

$$A_i = \frac{XY_i}{(X_i + Y_i)/2} \times 100$$

where Xi refers to the number of pixels belonging to
category i in map X, Yi refers to the number of pixels belonging
to category i in map Y, XYi refers to the number of pixels
belonging to category i in both maps X and Y, N is the
number of categories into which the pixels are classified and
M is the number of pixels into which the maps are divided.

The overall spatial agreement (Ao) and the overall spatial
inconsistency (OSI) metrics assess the spatial agreement
between the categories in two maps. One metric can be
obtained from the other. Whereas Ao shows the spatial
agreement (0-100%), the OSI shows the spatial
disagreement (0-100%). Added together, they come to 100.


Islam et al. (2019) proposed the overall areal
inconsistency (OAI), the individual areal inconsistency (AIC) and
the overall spatial inconsistency (OSI) metrics. They are
formulated as follows:

$$AIC_i = \frac{|X_i - Y_i|}{2}$$

$$OAI = \sum_{i=1}^{n} AIC_i$$

$$OSI = \frac{N_{(i \neq j)}}{N} \times 100$$

where Xi refers to the percentage of the total area represented
by category i in map X, Yi refers to the percentage of the
total area represented by category i in map Y, n is the total
number of categories, N is the number of pixels and N(i≠j) is
the number of pixels assigned to one category in Map X and
a different category in Map Y.

Overall areal inconsistency (OAI) shows the disagreement
between two maps in terms of category proportions and is
expressed in values of between 0 and 100, where 0 means
identical proportions. Users can also
assess the areal and spatial agreement/disagreement at a
category level through the individual areal inconsistency
(AIC) and individual spatial agreement (Ai) metrics. The
values for the latter range from 0 to 100, and a value of 100
means perfect agreement.

AIC does not have a standard scale of values, as these
depend on the proportion of the total area of the map
allocated to the category. It is therefore very difficult to compare
the values for this metric between classes, which limits its
usefulness.
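
The five metrics can be derived from a cross-tabulation matrix, as the following Python sketch illustrates. It is not the R code distributed with the book. It assumes a square matrix of pixel counts with map X in rows and map Y in columns, the background already excluded, and every category present in at least one of the maps. The function name and the toy matrix are hypothetical.

```python
def agreement_metrics(crosstab):
    """Compute Ao, OSI, OAI and per-category Ai and AIC.

    crosstab[i][j] = number of pixels in category i on map X
    and category j on map Y.
    """
    n = len(crosstab)
    total = sum(sum(row) for row in crosstab)
    x = [sum(crosstab[i]) for i in range(n)]                        # Xi
    y = [sum(crosstab[i][j] for i in range(n)) for j in range(n)]   # Yi
    xy = [crosstab[i][i] for i in range(n)]                         # XYi

    ao = 100.0 * sum(xy) / total
    osi = 100.0 * (total - sum(xy)) / total
    ai = [100.0 * xy[i] / ((x[i] + y[i]) / 2.0) for i in range(n)]
    # AIC works on percentages of the total area, not on pixel counts.
    aic = [abs(100.0 * x[i] / total - 100.0 * y[i] / total) / 2.0
           for i in range(n)]
    oai = sum(aic)
    return {"Ao": ao, "OSI": osi, "OAI": oai, "Ai": ai, "AIC": aic}


# Toy two-category cross-tabulation (hypothetical pixel counts).
metrics = agreement_metrics([[40, 10],
                             [5, 45]])
print(metrics)
```

In this toy case Ao and OSI add up to 100, and both categories have the same AIC because a surplus in one category is necessarily a deficit in the other.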


Utility 


Exercises 


1. To validate a map against reference data/map 
2. To validate a simulation against a reference map 
3. To validate simulated changes against a reference map of changes 


The areal and spatial agreement metrics assess the similarity
between the two maps. They are obtained from the
cross-tabulation matrix and therefore do not provide any
additional information, in that the values they provide can
also be obtained from the matrix. However, they are standard
metrics that allow us to measure the agreement between two
maps and summarize it in a single figure. In this sense, they
are similar to the user's and producer's accuracy metrics and
to Kappa indices. They are also complementary to quantity
and allocation (dis)agreement metrics, as they can
differentiate between spatial and quantity agreements.

These metrics can be used to assess how similar a land 
use/cover map is to another map used as a reference, i.e. the 
real situation on the ground. They can also be used to check 
the similarity between a simulation and the reference map 
for the same year. 


QGIS Exercises 


Available tools 


* Processing Toolbox 
R 

Areal and spatial agreement metrics 
Individual Areal Inconsistency.rsx 
Individual Spatial Agreement.rsx 
Overall Areal Inconsistency.rsx 
Overall Spatial Agreement.rsx 
Overall Spatial Inconsistency.rsx 


QGIS has no specific tool for calculating the metrics
proposed by Yang et al. (2017) and Islam et al. (2019).
However, these can be easily calculated from the cross-tabulation
matrix using the formulae set out above. We have also
developed several R tools that calculate each metric
automatically within QGIS.

When using these R scripts, the categories in LUC rasters 
must be coded in consecutive numbers, from 1 to the max- 
imum number of categories used in the map. Thus, in a raster 
with five categories, the categories must be coded as 1, 2, 3, 
4 and 5. 


Exercise 1. To validate a map against reference 
data/map 


Aim 


To validate the CORINE 2011 land use map, taking the SIOSE
2011 land use map as a reference. We will be focusing
particularly on how the “urban fabric” and “industrial and
commercial areas” categories are mapped in CORINE 2011.


Materials 


CORINE Land Use Map Asturias Central Area 2011 
SIOSE Land Use Map Asturias Central Area 2011 


Requisites 


All maps must be rasters and have the same resolution, 
extent, projection and number of categories. LUC categories 
must be coded consecutively from 1 to the maximum 
number of categories considered. 


Execution 


If necessary, install the plugin Processing R provider and
download the R scripts indicated above in the “Available
Tools” table. Paste the R scripts into the R scripts folder. For
more information, see Chapter “About This Book”.


Step 1 


Our maps do not comply with one of the requisites of the
tools we will be using, in that the categories in our LUC
maps are coded from 0 (agricultural areas) to 12
(background). The first step is therefore to reclassify the maps to
ensure that all the categories are coded consecutively from 1
to 13. This is done using the Reclassify by table tool (Figs. 2
and 3).
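
The logic of a range-based reclassification table can be sketched in Python as follows. This is only an illustration of the table shown in Fig. 3, not the QGIS implementation. It assumes that a value falls in a row when minimum < value <= maximum, and the function name is hypothetical.

```python
def reclassify(values, table, nodata=None):
    """Reclassify codes using rows of (minimum, maximum, new_value).

    Assumed boundary rule: minimum < value <= maximum.
    Values matching no row are set to `nodata`.
    """
    result = []
    for value in values:
        new_value = nodata
        for minimum, maximum, code in table:
            if minimum < value <= maximum:
                new_value = code
                break
        result.append(new_value)
    return result


# Rows equivalent to Fig. 3: original codes 0..12 become 1..13.
table = [(code - 1, code, code + 1) for code in range(0, 13)]

# Original codes: 0 (agricultural areas), 2, 3 and 12 (background).
reclassified = reclassify([0, 2, 3, 12], table)
print(reclassified)
```

With this table, urban fabric and industrial and commercial areas end up coded 3 and 4, which are the codes entered in the class-specific tools in Step 2.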


Step 2 


Once the maps comply with the requirements of the tools,
the different metrics can then be calculated. To test the
overall agreement between the assessed and the reference
maps, we will calculate the overall spatial agreement (Ao),
the overall areal inconsistency (OAI) and the overall spatial
inconsistency (OSI). For their part, individual areal
inconsistency (AIC) and individual spatial agreement (Ai) are
used to assess agreement specifically for the “urban fabric”
and “industrial and commercial areas” categories.

To calculate all these metrics, open the respective tool 
and select the maps you want to compare (Fig. 4): first the 
CORINE map and second the SIOSE map, which is used as 
a reference. In all cases, the background value of the maps 
(13) must also be indicated. Finally, specify the folder where 
the result from each tool will be stored. 

For class-specific metrics, indicate the codes of the classes
you want to validate (Fig. 5). In this case, we will be
calculating these metrics for two different classes: urban fabric,
which is coded 3 after reclassification, and industrial and
commercial areas, which is coded 4.


Fig. 2 Exercise 1. Step 1. Reclassify by Table (dialog with the parameters Raster layer: CORINE 2011, Band number: Band 1, Reclassification table: Fixed table (13x3))


      Minimum  Maximum  Value
 1         -1        0      1
 2          0        1      2
 3          1        2      3
 4          2        3      4
 5          3        4      5
 6          4        5      6
 7          5        6      7
 8          6        7      8
 9          7        8      9
10          8        9     10
11          9       10     11
12         10       11     12
13         11       12     13


Fig. 3 Exercise 1. Step 1. Reclassification table of the Reclassify by Table tool


Fig. 4 Exercise 1. Step 2. Overall Spatial Agreement R script (dialog with the parameters Land use map 1: C11 reclassified, Land use map 2: S11 reclassified, Background value: 13, and the CSV output A0)


Fig. 5 Exercise 1. Step 2. Individual Spatial Agreement R script (dialog with the parameters Land use map 1: C11 reclassified, Land use map 2: S11 reclassified, Background value: 13, LUC class to be evaluated: 3, and the CSV output A1)


Results and Comments 


After calculating all the different metrics, a numerical output 
is obtained for each one (Tables 3 and 4). This output is also 
stored in a CSV file in the selected folder. 


Table 3 Results from Exercise 1. Overall agreement indices

Metric                                Value
Overall spatial agreement (Ao)        86.85
Overall areal inconsistency (OAI)      3.11
Overall spatial inconsistency (OSI)   13.15


Table 4 Results from Exercise 1. Individual agreement indices

Metric                                 Urban fabric   Industrial and commercial areas
Individual spatial agreement (Ai)      69.95          67.62
Individual areal inconsistency (AIC)    0.05           0.14


There is a high overall spatial agreement (around 87%)
between the two maps and low areal inconsistency (around
3%). We can therefore consider the CORINE land cover
map for 2011 as validated. The category proportions
between CORINE and SIOSE are almost identical and the
spatial agreement is very high. The disagreements between
the two maps are due to their different degree of detail,
which draws small features in SIOSE that are not detected at
the scale used in CORINE.

At the class level, the picture is slightly different. For the
two classes we assessed (urban fabric and industrial and
commercial areas), spatial agreement between the two maps
was close to 70%. Although this is a high level of
agreement, it is much lower than the overall figure. This could be
due to the fact that these two classes are more sensitive than
others to the scale difference between SIOSE and CORINE.

In order to interpret the AIC metric, we need to first
understand the proportion of total area allocated to each class
on the two maps. AIC is half of the difference between the
two proportions (i.e. if the proportion allocated to one class
is 3% on one map and 4% on the other, the difference is 1%
and AIC is 0.5). In our case, the AIC value for urban fabric
is less than 0.1, which means a high level of agreement
between the two maps regarding the proportion of total area
allocated to this category (around 3.9%). The proportion
allocated to industrial and commercial areas is around 3% in
both maps and the AIC value is slightly more than 0.1. This
also indicates a high level of agreement, although less than
for urban fabric.


Exercise 2. To validate a simulation against a
reference map


Aim


To validate the simulation obtained by our land use/cover
change modelling exercise. We will focus on the two
categories we have modelled actively: “urban fabric” and
“industrial and commercial areas”.


Materials 


CORINE Land Use Map Asturias Central Area 2011 
Simulation CORINE Asturias Central Area 2011 


Requisites 


All maps must be rasters and have the same resolution, 
extent, projection and number of categories. LUC categories 
must be coded consecutively from 1 to the maximum 
number of categories considered. 


Execution 
Step 1 


The first step is to reclassify our maps to make them comply
with the requisites of the tools we will be using. These tools
require the categories to be consecutively coded from 1. This
means that “agricultural areas” (coded 0) must be given a
new code (Fig. 3). This is done using the Reclassify by table
tool (see the previous exercise).


Step 2 


Once the maps comply with the requirements of the tools, 
we can then calculate the different areal and spatial agree- 
ment metrics using the tools available in the R toolbox. 

To evaluate the global agreement between the simulation 
and the reference map, we will calculate the overall spatial 
agreement (Ao), the overall areal inconsistency (OAI) and 
the overall spatial inconsistency (OSI). To evaluate agree- 
ment for the categories that we actively modelled, we will 
calculate the individual areal inconsistency (AIC) and the
individual spatial agreement (Ai).

To calculate the metrics, open the corresponding tools
and indicate the following: the simulation to be evaluated,
the reference map (CORINE 2011), the background value of
the maps (13) and the folder where the results will be stored.
For the class-specific metrics, you must also provide the
codes of the classes you want to evaluate: in this case 3
(urban fabric) and 4 (industrial and commercial areas)
(Fig. 6).


Fig. 6 Exercise 2. Step 2. Individual Areal Inconsistency R script (dialog with the parameters Land use map 1: Simulation C11 reclassified, Land use map 2: C11 reclassified, LUC class to be evaluated: 3, Background value: 13, and the CSV outputs AIC, Prop Cat map 1 and Prop Cat map 2)


Results and Comments 


Once you have finished the exercise, you will obtain a
CSV file for each metric. The results are summarized in
Tables 5 and 6.


Table 5 Results from Exercise 2. Overall agreement indices

Metric                                Value
Overall spatial agreement (Ao)        99.05
Overall areal inconsistency (OAI)      0.26
Overall spatial inconsistency (OSI)    0.96


Table 6 Results from Exercise 2. Individual agreement indices

Metric                                 Urban fabric   Industrial and commercial areas
Individual spatial agreement (Ai)      97.35          97.05
Individual areal inconsistency (AIC)    0.006          0.005


The results show almost perfect agreement between our
simulation and the reference map. The maps share the same
LUC in 99% of their area and the areal inconsistency is
insignificant (0.26%). A similar pattern is observed in the
actively simulated classes.

These results are misleading. There is perfect agreement
between our simulation and the reference map in the
persistence areas. However, agreement is not as high in the areas
modelled as changes. Because there are relatively few
changes in our study area, the disagreement between the two
maps in areas where change is predicted has very little
impact on the overall high level of agreement created by
the correct simulation of persistence areas. To correctly
validate the changes that we simulated, we should repeat this
exercise, focusing exclusively on the areas that changed in
the simulation and in the reference map, as compared to the
initial map of the simulation (see next exercise).
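
The effect described above can be reproduced with a small Python sketch. It is a toy illustration with hypothetical data and function names, assuming flat lists of category codes for the initial, simulated and reference maps. It compares the overall share of matching pixels with the individual spatial agreement computed only on the pixels that changed to a given category.

```python
def overall_agreement(map_a, map_b):
    """Percentage of pixels with the same category on both maps."""
    same = sum(1 for a, b in zip(map_a, map_b) if a == b)
    return 100.0 * same / len(map_a)


def agreement_of_changes(initial, simulated, reference, category):
    """Ai restricted to pixels that changed to `category` since `initial`."""
    sim = [s == category and s != i for i, s in zip(initial, simulated)]
    ref = [r == category and r != i for i, r in zip(initial, reference)]
    both = sum(1 for s, r in zip(sim, ref) if s and r)
    mean_changed = (sum(sim) + sum(ref)) / 2.0
    if mean_changed == 0:
        return None  # no change to this category on either map
    return 100.0 * both / mean_changed


# 100 pixels, all in category 1 at the initial date (hypothetical data).
initial = [1] * 100
reference = [2 if k < 4 else 1 for k in range(100)]        # pixels 0-3 change
simulated = [2 if 3 <= k < 7 else 1 for k in range(100)]   # pixels 3-6 change

overall = overall_agreement(simulated, reference)
changes = agreement_of_changes(initial, simulated, reference, 2)
print(f"overall agreement = {overall}%, agreement of changes = {changes}%")
```

Although both maps contain the same amount of change, only one of the four changed pixels coincides, so the agreement of changes is far lower than the overall agreement, which is dominated by persistence.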


Exercise 3. To validate simulated changes against a 
reference map of changes 


Aim 


To validate the changes simulated by our land use/cover 
change modelling exercise. 


Materials 


CORINE Land Use Changes Asturias Central Area 2005-2011
Simulated CORINE changes Asturias Central Area 2005-2011


Requisites 


All maps must be rasters and have the same resolution, 
extent, projection and number of categories. LUC categories 
must be coded consecutively from 1 to the maximum 
number of categories considered. 


Execution


Step 1


Our maps do not comply with the requirements for the tools. In
the map of simulated changes, the categories are not
consecutively coded from 1. In addition, the reference map of changes
has many more categories than the map of simulated changes.
Using the Reclassify by table tool, we can reduce the
categories on the two maps to the two categories that appear in
both (urban fabric and industrial and commercial areas), plus a
third category covering non-changing areas and changes that
were not simulated. As shown in Figs. 7 and 8, non-changing
areas and non-simulated changes are assigned code 1, urban
fabric code 2 and industrial and commercial areas code 3.
Figures 7 and 8 show the reclassification
codes that must be inputted into the Reclassify by table tool.


      Minimum  Maximum  Value
 1         -1        0      1
 2          0        1      1
 3          1        2      1
 4          2        3      2
 5          3        4      3
 6          4        5      1
 7          5        6      1
 8          6        7      1
 9          7        8      1
10          8        9      1
11          9       10      1


Fig. 7 Exercise 3. Step 1. Reclassification table of the Reclassify by Table tool (CORINE changes)


      Minimum       Maximum  Value
 1    [illegible]         0      1
 2              2         3      2
 3              3         4      3


Fig. 8 Exercise 3. Step 1. Reclassification table of the Reclassify by Table tool (Simulated CORINE changes)


Step 2


After reclassifying the maps, we will calculate the following
metrics to validate the simulated changes: individual areal
inconsistency (AIC) and individual spatial agreement (Ai).
As we are only comparing two categories, the overall
metrics provide the same information as the individual ones.

For each metric, we will open the corresponding tool,
indicating the map of simulated changes to be validated (Land
use map 1), the reference map of changes (Land use map 2),
the background value of the maps, the category we are
going to evaluate (urban fabric, 2, Fig. 9; industrial and
commercial areas, 3, Fig. 10) and the folder where the results
of the analysis will be stored. We use 999 as the background
value because no specific value was assigned to the
background in our maps. A value of 0 means no change, which is
another category that must be considered in this analysis.


Fig. 9 Exercise 3. Step 2. Individual Spatial Agreement R script (urban fabric)


Fig. 10 Exercise 3. Step 2. Individual Spatial Agreement R script (industrial and commercial areas)


Results and Comments


A CSV file will be created for each metric. The results are
summarized in Table 7.


Table 7 Results from Exercise 3. Individual agreement indices

Metric                                 Urban fabric   Industrial and commercial areas
Individual spatial agreement (Ai)      22.37          19.86
Individual areal inconsistency (AIC)   0              0


The same amount of changes took place in the reference
map of changes as in our simulation. There is no
disagreement on this point. However, unlike the previous exercise,
the spatial agreement between the simulated and the
reference changes was very low. The Ai value for the two
categories that were actively simulated was quite similar (less
than 25%).

These results mean that less than a quarter of the simulated
changes were allocated in the same places as the changes
observed on the reference map. This result, by itself, is not
sufficient to consider the simulation invalid. We need to gain
a better picture of the location of the changes that were
simulated and their pattern. Even if they were not allocated
in exactly the same places as on the reference map, they may
be allocated in the same general area and follow a similar
pattern, indicating that the model has correctly simulated the
processes of change. To assess these aspects, we can perform
a visual inspection of the reference and simulated changes on
the maps, cross-tabulate them at multiple resolutions (see
Sect. 2 in Chapter “Basic and Multiple-Resolution
Cross-Tabulation to Validate Land Use Cover Maps”) and
calculate the spatial metrics (see Sect. 1 in Chapter “Spatial
Metrics to Validate Land Use Cover Maps”).


3 Kappa Indices

Description


Kappa indices assess the agreement between two sources of
spatial data, corrected by the agreement that is expected by
chance. They are typically used to compare the agreement
between two maps and to compare one map with reference
information (e.g. a collection of validation points).

The first Kappa index (Cohen's Kappa) dates from 1960
(Cohen, 1960) and has been widely used in LUC analysis.
Many variants of this original index have been
proposed. They mainly apply to the comparison between two
maps. Of these, the following are of particular interest:


* Pontius Jr. (2000) split Cohen's Kappa into three indices,
  called Kno, Kquantity and Klocation. These indices offer
  more information about the causes of the (dis)agreement
  between two compared maps, i.e. (dis)agreement in terms
  of the different allocation of the categories on the two
  maps and (dis)agreement in terms of the different
  proportions in which the categories appear on the two maps.
* Hagen (2002), following the work done by Pontius Jr.,
  split Cohen's Kappa into two indices, called Khistogram
  and Klocation. These refer to the Kappa agreement in
  terms of the categories appearing in the same proportions
  (histograms) on the two maps and the Kappa agreement
  due to the categories appearing in the same location on
  the two maps.
* Van Vliet et al. (2011) proposed the Kappa simulation,
  which was specifically designed for validating LUCC
  models. It assesses the agreement between the changes on
  two maps, as compared to a third map used as an initial
  point, corrected by the agreement expected by chance.
* Hagen (2003) and Van Vliet et al. (2013) also
  incorporated fuzzy logic into the calculation of Kappa indices,
  creating fuzzy Kappa and fuzzy Kappa simulation. They
  took the degree of spatial and thematic mismatch into
  account when calculating Kappa. In other words, two
  maps may be said to show partial agreement if the
  validation pixel or point is close to the compared pixel. The
  same would apply if the pixels were allocated to different
  classes, but with similar meanings.

Utility 


Exercises 


1. To validate a map against reference data/map 

2. To validate a simulation against a reference map 

3. To validate a simulation against a reference map at the category 
level 


Kappa indices enable us to test the similarity between two 
sources of spatial information. If we have one map and 
reference data, we can determine to what extent the map we 
want to validate agrees with the reference data. 

The main advantage of Kappa indices is that they provide
a standard measure. Kappa agreement always ranges
between -1 and 1, where 1 means total agreement, -1 total
disagreement and 0 random agreement. These are universal
measures, which means that the performance of a LUC
classification exercise or a LUCC modelling exercise can be
compared with the performance typically achieved in these
exercises.
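
As an illustration of how the chance correction works, the following Python sketch computes Cohen's Kappa from a cross-tabulation matrix. It is not the r.kappa implementation. The function name and the toy matrices are hypothetical, and the matrix is assumed to be square, with the map being validated in rows and the reference in columns.

```python
def cohen_kappa(crosstab):
    """Cohen's Kappa from a square cross-tabulation of pixel counts."""
    n = len(crosstab)
    total = sum(sum(row) for row in crosstab)
    row_totals = [sum(crosstab[i]) for i in range(n)]
    col_totals = [sum(crosstab[i][j] for i in range(n)) for j in range(n)]

    # Observed agreement: proportion of pixels on the diagonal.
    observed = sum(crosstab[i][i] for i in range(n)) / total
    # Agreement expected by chance, given the category proportions.
    expected = sum(row_totals[i] * col_totals[i] for i in range(n)) / total ** 2
    return (observed - expected) / (1.0 - expected)


# Toy two-category example: 85% of the pixels agree.
kappa = cohen_kappa([[40, 10],
                     [5, 45]])
# Identical maps give a Kappa of 1.
kappa_perfect = cohen_kappa([[50, 0],
                             [0, 50]])
print(f"Kappa = {kappa:.2f}, Kappa (identical maps) = {kappa_perfect:.2f}")
```

In the toy example 85% of the pixels agree, but half of that agreement would be expected by chance, so the resulting Kappa is noticeably lower than the raw agreement.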

There are many critics of the widespread use of Kappa 
metrics, especially in LUCC modelling validation. There is 
now a general consensus that these indices should not be the 
only validation measures used when evaluating modelling 
exercises and maps. More information about the limitations 
of Kappa indices and the criticisms levelled against them can 
be found in Pontius Jr. and Millones (2011) and Van Vliet 
et al. (2011). 


QGIS Exercises 


Available tools 


* Processing Toolbox 
GRASS 
Raster 
r.kappa 
* Semi-Automatic Classification Plugin 
Tab: Postprocessing 
Section: Accuracy 

QGIS does not include many tools for calculating Kappa
indices. Cohen's Kappa index can be obtained through
the associated GRASS module. The Semi-Automatic
Classification plugin also calculates the Kappa index, globally
and at the category level, when doing the cross-tabulation
(see Chapter “Basic and Multiple-Resolution
Cross-Tabulation to Validate Land Use Cover Maps”). The other
variants of Kappa are not available through QGIS or any of its
associated software, such as R. Those who would like to calculate
these indices are referred to the Map Comparison Kit, which
is also available for free.¹


Exercise 1. To validate a map against reference 
data/map 


Aim 


To test the validity of the CORINE 2011 land use map, taking
the SIOSE land use map as a reference. In this way, we can
answer the following question: assuming that the SIOSE
map shows the true situation, how true is the CORINE map?


Materials 


SIOSE Land Use Map Asturias Central Area 2011 
CORINE Land Use Map Asturias Central Area 2011 


Requisites 

The two maps must be rasters and have the same extent, 
spatial resolution, projection and legend. If they do not have 
the same legend, the user must reclassify the maps in such a 
way that they comply with this requirement. 

Execution 

Step 1 

Open the r.kappa function and fill in the required parame- 
ters: raster to be validated (CORINE map) and reference 
raster (SIOSE map) (Fig. 11). 

Results and Comments 

Once the function has been executed, QGIS creates a new 


text file (.txt) in the specified folder. Users must manually 
access this folder to open the text file and see the results of 


! http://mck.riks.nl/downloads. 
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Q r.kappa 


Parameters Log 


J.-F. Mas et al. 


Raster layer containing dassification result 


a" C11 [EPSG:32630] 


Raster layer containing reference dasses 


=" 511 [EPSG:32630] 
Title for error matrix and kappa 


v a 


ACCURACY ASSESSMENT 


[C] No header in the report 
[C] Wide report (132 columns) 
"w Advanced parameters 


GRASS GIS 7 region extent (xmin, xmax, ymin, ymax) [optional] 


0,000000 


GRASS GIS 7 region cellsize (leave 0 for default) 


Error matrix and kappa 


C:/Users/David/Desktop/LU_exercises/Kappa_index. txt 


Fig. 11 Exercise 1. Step 1. R.kappa 


the analysis. These include a cross-tabulation matrix of the 
maps, together with the Kappa value. For the two maps 
assessed, we obtained the following Kappa: 


Kappa — 0.88 


where 1 means total agreement, -1 total disagreement and 0
random agreement. A Kappa index value of 0.88 means that
the two maps are very similar and therefore that our map has
been validated. As a general rule, Kappas above 0.7-0.8 are
considered good enough for validation. Kappas above 0.9
indicate very high agreement.
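
To make the interpretation of these values more concrete, the
following minimal sketch computes the standard Cohen's Kappa
from a cross-tabulation matrix. It is an illustration only: the
function name and the 3-class matrix are hypothetical, and the
report written by r.kappa may differ in details such as the
handling of no-data cells.

```python
def kappa_from_matrix(matrix):
    """Cohen's Kappa from a square cross-tabulation matrix.

    Rows are the classes of the map being validated, columns the
    classes of the reference map; cells hold pixel counts.
    """
    q = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of pixels on the diagonal.
    observed = sum(matrix[k][k] for k in range(q)) / total
    # Agreement expected by chance, from the row and column totals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(q)) for j in range(q)]
    expected = sum(row_totals[k] * col_totals[k] for k in range(q)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical cross-tabulation (pixel counts), not taken from the exercise.
example = [
    [50, 2, 3],
    [4, 30, 1],
    [1, 2, 7],
]
kappa = kappa_from_matrix(example)
print(round(kappa, 4))  # 0.7709
```

In this made-up case, 87% of the pixels agree, but part of that
agreement is expected by chance, so Kappa is lower (about 0.77).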

In our case, it is always important to bear in mind that 
SIOSE is made at a more detailed scale than CORINE. The 
two maps have different minimum mapping units and min- 
imum mapping widths, which means that perfect agreement 
is impossible. The SIOSE map will always draw features 
that are not detected in CORINE because of its coarser scale. 


Kappa scores of almost 0.9, like this one, show almost 
perfect agreement between the two sources. 

Users can also assess the agreement between CORINE 
and SIOSE at the category level so as to obtain more 
information about the similarities and dissimilarities between 
the two maps. To compute these metrics, they should refer to 
Exercise 3, using the Semi-Automatic Classification Plugin 
instead of r.kappa. 


Exercise 2. To validate a simulation against a reference map


Aim 


To validate the simulation obtained by our land use/cover 
change modelling exercise. 
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Materials 


Simulation CORINE Asturias Central Area 2011 
CORINE Land Use Map Asturias Central Area 2011 


Requisites 

The two maps being compared must be rasters and have 
identical resolution, extent, projection and legend. For 
proper validation, the reference map must refer to the same 
date for which the landscape was simulated. 

Execution 

Step 1 

Open the r.kappa function and fill in the required parame-
ters: raster to be validated (Simulation) and reference raster
(CORINE 2011) (Fig. 12).


Results and Comments 


QGIS will create a text file in the specified folder. This file 
contains the Kappa value for our simulation: 


Kappa = 0.99


where 1 means total agreement. The Kappa value indicates 
that the two maps are almost the same. However, this does 
not mean that the changes we simulated are the same as the 
changes that took place on the reference map (CORINE 
2011) as compared to the map used as the starting point for 
our modelling exercise (CORINE 2006). 

Fig. 12 Exercise 2 Step 1. R.kappa


In our simulation, most of the landscape remains unchan-
ged. The high Kappa value indicates that we have correctly
modelled the persistence of these unchanged areas. However,
it is difficult to draw any meaningful conclusions about how
closely the changes we simulated fit the changes observed
between the CORINE 2011 and 2006 maps. These changes
only affect very small parts of the maps and, therefore, do not
have a meaningful impact on the Kappa index when evalu-
ating the agreement between the entire area of the maps.
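
The effect of persistence on Kappa can be illustrated with a
small synthetic example. The sketch below is not based on the
exercise data: the three maps are hypothetical lists of 10,000
cells in which only 1% of the landscape changes, and every
simulated change is placed in the wrong location.

```python
from collections import Counter


def kappa(map_a, map_b):
    """Cohen's Kappa between two maps given as flat lists of class codes."""
    n = len(map_a)
    pairs = Counter(zip(map_a, map_b))
    classes = sorted(set(map_a) | set(map_b))
    observed = sum(pairs[(c, c)] for c in classes) / n
    count_a, count_b = Counter(map_a), Counter(map_b)
    expected = sum(count_a[c] * count_b[c] for c in classes) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical landscape: 5000 cells of class 1 and 5000 cells of class 2.
start = [1] * 5000 + [2] * 5000
reference = list(start)
simulation = list(start)

# Observed change: 100 cells go from class 2 to class 1.
for cell in range(5000, 5100):
    reference[cell] = 1
# Simulated change: the same quantity, but in different cells.
for cell in range(9900, 10000):
    simulation[cell] = 1

overall_kappa = kappa(simulation, reference)
# Cells that changed in the reference and were simulated correctly.
hits = sum(
    1 for s, r, t in zip(start, reference, simulation) if s != r and r == t
)
print(round(overall_kappa, 2), hits)  # 0.96 0
```

Although not a single change is correctly located, Kappa is
still about 0.96, because 98% of the cells agree through
persistence alone.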

In order to gain a better picture as to how well the sim- 
ulated changes fit the changes in the reference maps, other 
complementary metrics also described in this book can be 
used, such as the quantity and allocation disagreement or the 
figure of merit (see Sects. 3 and 4 in Chapter “Pontius Jr.
Methods Based on a Cross-Tabulation Matrix to Validate
Land Use Cover Maps”). The agreement between simulated
and reference changes can also be assessed using Kappa 
simulation, although this metric is not currently implemented 
in any tool in QGIS or in its associated software, such as R. 

Users can also evaluate the kappa agreement between the 
simulation and the reference map at the category level, for 
which purposes they should refer to the next exercise, 
Exercise 3. 


Exercise 3. To validate a simulation against a reference map
at the category level


Aim 
To validate a simulation obtained by our land use/cover
change modelling exercise at the general and category level,
focusing on a specific category.


Materials


CORINE Land Cover Map Val d'Ariège 2018
Simulation LCM Val d'Ariège 2018


Requisites 


The two maps to be compared must be rasters and have 
identical resolution, extent, projection and legend. For 
proper validation, the reference map must refer to the same 
date for which the landscape was simulated. 


Execution 
Step 1 


The Kappa index can be calculated at the category level for 
all the categories in our map using the Semi-Automatic 
Classification Plugin. To this end, open the plugin and select 
the Accuracy (Postprocessing) option from the menu. Then 
choose the rasters to be assessed, i.e. the simulation and the 
reference map (Fig. 13). It is also important to indicate the 
code for no data or background. In our case, the code is 10. 


Results and Comments 


After executing the tool, we obtain a raster that
cross-tabulates the compared maps and a CSV file with the
cross-tabulation matrix, the overall, user's and producer's
accuracy values and the Kappa indices of agreement, overall
and per category. This information will also be displayed in
the output window. For detailed information about how to
interpret the matrices and the user's and producer's accuracy
values, please refer to Sect. 4 in Chapter “Pontius Jr.
Methods Based on a Cross-Tabulation Matrix to Validate
Land Use Cover Maps”.


Fig. 13 Exercise 3 Step 1. Semi-Automatic Classification Plugin


Table 8 Results from Exercise 3. Kappa indices: overall and per category

Category                                         Kappa
Overall                                          0.9849
Built-up areas (1)                               0.9092
Agricultural areas (2)                           0.9699
Forests (3)                                      0.9993
Shrub and/or herbaceous vegetation (4)           0.9644
Open spaces with little or no vegetation (5)     1.0000
Water surfaces (6)                               1.0000

The Kappa values for the two maps show high levels of
agreement at both a general level and for all categories
(Table 8). The Kappa values for the “Open spaces with little
or no vegetation” and “Water surfaces” categories are 1,
which means perfect agreement. In other words, there are no
differences between the two maps for these classes. This
makes sense because they were not simulated in our mod-
elling exercise.

The class with the lowest Kappa value is “Built-up areas”.
This indicates that many of the changes in this category have
not been correctly simulated, which is to be expected given
the dynamism of this category when compared with others
such as forest or water surfaces. It is normally easier to
simulate static land categories than changing ones. This
explains why “Built-up areas” obtained a lower Kappa
score (0.9092) than the overall score (0.9849) (Table 8).

Although these results offer some clues as to how well the 
changes in some categories were simulated, to obtain a more 
detailed understanding other methods and metrics should be 
used, such as the quantity and allocation disagreement and 
the figure of merit (see Sects. 3 and 4 in Chapter “Pontius Jr. 
Methods Based on a Cross-Tabulation Matrix to Validate 
Land Use Cover Maps”) or the Kappa simulation metrics. 
Whereas the Kappa metrics calculated here assess the 
agreement between persistent and changing areas in the 
compared and the reference maps, the other tools and 
methods focus on the specific areas that change between the 
initial and the final year of the simulation. This is a key 
element for understanding the success of our simulation, as it 
is easier to model persistence than change. 


4 Agreement Between Maps at Overall 
and Stratum Level 
Description 


The aim is to assess the agreement between map pairs, such
as a reference map and a simulation map, at different levels:
overall agreement for the whole map and agreement for a
given stratum, i.e. a smaller area formed by a particular
territory, LUC category or transition, or by sample areas
according to a gradient such as distance to a road. The
purpose of this validation method is encapsulated in the
following question: Does a particular item or area of interest
show the same prediction score as the whole map?


Utility 


Exercises 


1. To validate simulated changes against a reference map of changes 


A given map (LUC map, simulation) can be evaluated more 
precisely at spatial level (specific territory), category level (Is 
the simulation closer to the real situation for built-up areas 
or for forests?) or specific transitions (Does the model work 
better for the transition from forest to agriculture or from 
forest to pasture?). In this context, the entire area of interest 
can be used as a guide for interpreting particular simulation 
scores. 


QGIS Exercise 


Available tools 


* Raster
    Raster calculator
* Processing Toolbox
    GRASS
        Raster (r.*)
            r.kappa
    Raster analysis
        Reclassify by table
        Raster layer unique values report


Agreement between maps at the overall and stratum levels is
more a validation approach than a specific method. Accord-
ingly, there are no specific tools available in QGIS to carry
out this analysis, as the tool used will depend on the type of
analysis to be carried out at the overall and stratum levels.

For general operations, we will make use of the QGIS
Raster Calculator, a generic tool performing all kinds of
raster calculations. To calculate Kappa indices at the global
and stratum levels, we will make use of r.kappa. For more
information about this tool, please refer to the previous
section.


Exercise 1. To validate simulated changes against a 
reference map of changes 


Aim 


To find out if the agreement between an observed (reference 
map) and a simulated transition varies for several 
distance-based categories resulting from a driver (e.g. dis- 
tance to roads). 


Materials 


CORINE Land Cover Map Val d'Ariège 2012
CORINE Land Cover Map Val d'Ariège 2018
Simulation LCM Val d'Ariège 2018

Distance to roads 
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Requisites 


All maps must be in raster format with the same resolution, 
extent and spatial reference system (SRS). 


Execution 
Step 1 


First, we have to obtain the observed and simulated transitions
from agriculture and pasture land to built-up areas over the
period 2012-2018. Using the raster calculator, we extract the
observed ("CLC_2012@1" = 2 AND "CLC_2018@1" = 1)
and the simulated ("CLC_2012@1" = 2 AND
"CLC_predict_2018@1" = 1) transition from agriculture and
pasture land (Category 2) to built-up areas (Category 1). The
result is shown in Fig. 14 (observed change appears in cyan,
simulated change in red).
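
Outside QGIS, the logic of these two raster calculator
expressions can be sketched as follows. This is only an
illustration in plain Python: the function name is ours and the
three 3 x 3 grids are hypothetical, standing in for the 2012,
2018 and simulated 2018 maps.

```python
def transition_mask(initial, final, from_class, to_class):
    """Return a 0/1 grid marking cells that go from from_class to to_class."""
    return [
        [1 if a == from_class and b == to_class else 0
         for a, b in zip(row_initial, row_final)]
        for row_initial, row_final in zip(initial, final)
    ]


# Hypothetical grids (1 = built-up, 2 = agriculture and pasture, 3 = forest).
clc_2012 = [[2, 2, 3], [2, 1, 3], [2, 2, 2]]
clc_2018 = [[1, 2, 3], [1, 1, 3], [2, 2, 1]]
clc_predict_2018 = [[1, 2, 3], [2, 1, 3], [2, 1, 2]]

# Same conditions as the raster calculator expressions: 2012 = 2 AND 2018 = 1.
observed = transition_mask(clc_2012, clc_2018, 2, 1)
simulated = transition_mask(clc_2012, clc_predict_2018, 2, 1)
print(observed)   # [[1, 0, 0], [1, 0, 0], [0, 0, 1]]
print(simulated)  # [[1, 0, 0], [0, 0, 0], [0, 1, 0]]
```

Both outputs are binary maps (1 = transition, 0 = no
transition), which is what the following steps rely on.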


Step 2 


The Reclassify by table raster analysis tool is used to transform
the map showing the continuous distance from roads into
various different classes. Given the dense road network, we
intentionally apply a progressive interval as shown in Fig. 15.
Figure 16 shows the general result and the result for a
detailed area with the following classes: the road network
itself (distance is zero), distance class 1 (less than 100 m),
class 2 (100-300 m), class 3 (300-1000 m) and class 4
(more than 1000 m).


Fig. 14 Exercise 1. Step 1. Intermediate maps showing observed and simulated transitions from agriculture and pasture to built-up areas


Fig. 16 Exercise 1. Step 2. Intermediate map showing the distance from roads reclassified by intervals


Fig. 17 Exercise 1. Step 3. Intermediate maps showing observed (left) and simulated (right) transition from 2 to 1 as a function of the road
distance classes
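
The reclassification described in Step 2 can be expressed as a
simple rule. The sketch below is an assumption-laden
illustration: the function name is ours, and the treatment of
the exact boundary values (100, 300 and 1000 m) is assumed,
since it depends on the range boundaries option chosen in the
Reclassify by table tool.

```python
def distance_class(distance_m):
    """Map a distance to roads (in metres) to the classes used in Step 2.

    Assumption: lower bounds are inclusive and upper bounds exclusive.
    """
    if distance_m == 0:
        return 0  # the road network itself
    if distance_m < 100:
        return 1  # less than 100 m
    if distance_m < 300:
        return 2  # 100-300 m
    if distance_m < 1000:
        return 3  # 300-1000 m
    return 4      # 1000 m or more


# Hypothetical distances, in metres.
samples = [0, 50, 100, 299.9, 300, 999, 1000, 2500]
print([distance_class(d) for d in samples])  # [0, 1, 2, 2, 3, 3, 4, 4]
```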


Step 3 


The next step is to compute observed and predicted transitions
from Category 2 to Category 1 as a function of the road dis-
tance classes. To this end, we use the Raster calculator again
to calculate: (i) the road distance class map multiplied by the
observed transition map and (ii) the road distance class map
multiplied by the simulated transition map. The results can be
seen in Fig. 17, in which the two maps show the transition
from 2 to 1 as a function of road distance. The map on the left
shows the observed transition and the map on the right shows
the simulated transition, with a detailed area in both cases.
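
The multiplication in Step 3, and the pixel count that follows
it, can be sketched in a few lines. The grids and function
names below are hypothetical and only illustrate the logic; in
the exercise, both operations are carried out with QGIS tools.

```python
from collections import Counter


def multiply(raster_a, raster_b):
    """Cell-by-cell product of two grids of the same size."""
    return [
        [a * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(raster_a, raster_b)
    ]


def unique_values_report(raster):
    """Count the number of cells per value."""
    return Counter(value for row in raster for value in row)


# Hypothetical grids: distance classes (1 to 4) and a 0/1 transition map.
distance_classes = [[1, 1, 2], [1, 2, 3], [2, 3, 4]]
observed_transition = [[1, 0, 0], [1, 0, 0], [0, 0, 1]]

stratified = multiply(distance_classes, observed_transition)
counts = unique_values_report(stratified)
print(stratified)    # [[1, 0, 0], [1, 0, 0], [0, 0, 4]]
print(dict(counts))  # {1: 2, 0: 6, 4: 1}
```

Note that, with this approach, a value of 0 in the product
groups together all cells without a transition and any
transition cells located on the road network itself (distance
class 0).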


Step 4 


Finally, we compare observed and simulated transitions as a
function of distance classes (strata). We use the Raster layer
unique values report raster analysis tool to calculate the
number of pixels for each road distance category (observed and
simulated) for the transition from category 2 to 1, as shown in
Fig. 18 (left for observed, right for simulated transition).

The results are then converted into percentages, as shown
in Table 9.
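
As a check on this conversion, the percentages can be
recomputed from the pixel counts reported in Fig. 18 and
Table 9. The short sketch below uses those counts; the
variable and function names are ours.

```python
# Pixel counts per road distance class, as reported in Fig. 18 and Table 9.
observed = {"<100 m": 38169, "100-300 m": 3600, "300-1000 m": 363}
predicted = {"<100 m": 13045, "100-300 m": 967}


def percentages(counts):
    """Share of each class in the total number of transition pixels."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


observed_pct = percentages(observed)
predicted_pct = percentages(predicted)
# Ratio between the number of observed and predicted transition pixels.
ratio = sum(observed.values()) / sum(predicted.values())

print(observed_pct)
print(predicted_pct)
print(round(ratio, 2))
```

The ratio between the two totals (42,132 and 14,012 pixels) is
about 3, and the share of transitions within 100 m of a road is
close to 90% in both cases.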


Results and Comments 


The result is that there are almost three times as many
observed transitions as predicted transitions. However, the
proportion of near-to-road transitions is approximately the
same. In conclusion, the model underestimates the quantity
of agriculture and pasture land that is transformed into
built-up areas, although in the areas close to roads, it accu-
rately predicted what happened in the Ariège Valley between
2012 and 2018.


Fig. 18 Exercise 1. Step 4 presented in the “output” window. Number of cells and areas of observed (left) and simulated (right) transition from 2
to 1 as a function of the road distance classes


Table 9 Exercise 1 Step 4. Number and proportion of cells of observed and simulated transition from 2 to 1 as a function of the road distance
classes

                 Observed              Predicted
                 Pixels      %         Pixels      %
<100 m           38,169      90.59     13,045      93.10
100-300 m        3600        8.54      967         6.9
300-1000 m       363         0.86
Sum              42,132                14,012


5 Accuracy Assessment Statistics
Description


The thematic accuracy assessment statistics are a set of
parameters that measure the degree of agreement between
the LUC map and the reference data (for more details about
reference data, see Chapter “Sample Data for Thematic
Accuracy Assessment in QGIS”). Overall accuracy, user's
accuracy and producer's accuracy are reported in many
studies. Some additional accuracy measures such as the
standard error of overall accuracy and the confidence inter-
vals for the adjusted areas are also helpful.

All these parameters are mainly derived from the error
or confusion matrix (see Chapter “Basic and Multiple-
Resolution Cross-Tabulation to Validate Land Use
Cover Maps”). This matrix is obtained from a
cross-tabulation between the reference data and the
thematic map. In the resulting table, the reference data
are generally shown in the columns and the map data in
the rows (Table 10).

In Table 10, $n_{ij}$ refers to the sample count of spatial units
in cell (i, j); $n_{i+}$ and $n_{+j}$ denote the sum of $n_{ij}$ in each
row and column, respectively; and n is the sample size. $n_{+j}$ is
the number of spatial assessment units belonging to class j
according to the reference data, and $n_{i+}$ is the number of
spatial units belonging to class i according to the thematic map.


Expressing the error matrix in terms of area proportions
instead of sample counts enables the calculation of unbiased
area estimators. The area proportions ($p_{ij}$) are defined as
follows:


$$p_{ij} = W_i \frac{n_{ij}}{n_{i+}}$$


where $W_i$ = (Map area of class i)/(Total area of the map).
Based on these area proportions, the overall estimated
accuracy ($O$), user's accuracy ($U_i$) and producer's accuracy
($P_j$) are calculated with the following equations, where q is
the number of classes:


$$O = \sum_{j=1}^{q} p_{jj}$$


$$U_i = \frac{p_{ii}}{p_{i+}}$$


$$P_j = \frac{p_{jj}}{p_{+j}}$$

Errors of commission and omission are complementary 
concepts of the user’s and producer’s accuracy metrics, 
respectively (i.e. error = 1 — accuracy). An error of com- 
mission occurs when a feature is included in a thematic class 
to which it does not belong. In contrast, an error of omission 
occurs when a feature is excluded from the thematic class to 
which it belongs (Finegold et al. 2016). 
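
The accuracy measures defined above can be sketched in a few
lines of code. This is a minimal illustration under stated
assumptions, not the implementation used by any of the plugins:
the function names, the two-class sample counts and the map
areas are all hypothetical.

```python
def area_proportions(counts, map_areas):
    """Convert sample counts n_ij into area proportions p_ij.

    counts: error matrix of sample counts (rows = map, columns = reference).
    map_areas: mapped area of each class, in the same order as the rows.
    """
    total_area = sum(map_areas)
    weights = [area / total_area for area in map_areas]  # W_i
    return [
        [weights[i] * n_ij / sum(row) for n_ij in row]
        for i, row in enumerate(counts)
    ]


def accuracies(p):
    """Overall, user's and producer's accuracy from area proportions."""
    q = len(p)
    overall = sum(p[k][k] for k in range(q))
    users = [p[i][i] / sum(p[i]) for i in range(q)]
    producers = [p[j][j] / sum(p[i][j] for i in range(q)) for j in range(q)]
    return overall, users, producers


# Hypothetical sample counts and mapped areas (e.g. in ha) for two classes.
counts = [[45, 5], [10, 40]]
map_areas = [200.0, 800.0]

p = area_proportions(counts, map_areas)
overall, users, producers = accuracies(p)
# Errors of commission and omission are the complements of these values.
commission = [1 - u for u in users]
omission = [1 - pr for pr in producers]
print(round(overall, 2), [round(u, 2) for u in users],
      [round(pr, 2) for pr in producers])
```

In this made-up case, class 1 has a high user's accuracy (0.9)
but a much lower producer's accuracy (about 0.53), because many
reference samples of class 1 fall in areas mapped as class 2,
which covers most of the map.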
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Errors in the classification process can increase the uncer- 
tainty in area estimation. However, the pixel count multiplied 
by pixel size is often used as an estimator of the true area on the 
ground. This measurement is strongly affected by both omis- 
sion and commission errors (Gallego, 2004). Olofsson et al. 
(2013) proposed an unbiased area estimate using an adjust- 
ment factor obtained from the error matrix: 


$$\hat{A}_j = A_{total} \times p_{+j}$$


$\hat{A}_j$ is the unbiased area estimator or adjusted area. In this
case, the total area of the map ($A_{total}$) is multiplied by
the proportion of class j estimated from the reference data
($p_{+j}$). If there are more samples labelled as class j in the
reference sample than in the map, then $\hat{A}_j$ will be larger than
the area obtained directly by pixel counting.
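
A short numerical sketch may help to show how this adjustment
works. The area proportions and the total area used below are
hypothetical, and the function name is ours.

```python
def adjusted_areas(p, total_area):
    """Adjusted area of each class: total area times the column sum p_+j."""
    q = len(p)
    return [total_area * sum(p[i][j] for i in range(q)) for j in range(q)]


# Hypothetical area proportions p_ij (rows = map, columns = reference).
p = [[0.18, 0.02], [0.16, 0.64]]
total_area = 1000.0  # e.g. ha

# Areas obtained directly from the map (row sums) versus adjusted areas.
mapped = [total_area * sum(row) for row in p]
adjusted = adjusted_areas(p, total_area)
print([round(a, 1) for a in mapped])    # [200.0, 800.0]
print([round(a, 1) for a in adjusted])  # [340.0, 660.0]
```

Here class 1 is more frequent in the reference data than in the
map, so its adjusted area (340) is larger than the mapped area
(200).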


Utility 


Exercises 


1. To validate a map against reference data/map 


The statistics obtained from the thematic accuracy assess- 
ment are not only descriptors of the map quality but also 
represent a fundamental input for calculating unbiased area 
estimators. Additionally, they provide the necessary ele- 
ments to decide whether to increase the number of sampling 
sites in the reference data, if the precision obtained does not 
meet the initial mapping objectives. 


QGIS Exercise 


Available tools 


* MapAccurAssess Plugin 


In QGIS, several plugins, such as Semi-Automatic Classi-
fication, AcATaMa and MapAccurAssess, can be used to
calculate the map accuracy statistics. All three plugins pro-
vide the overall accuracy, producer's accuracy, user's
accuracy and the error matrix, although AcATaMa and
MapAccurAssess also report some additional statistics about
the adjusted areas and their levels of accuracy.

In this exercise, we use the MapAccurAssess plugin 
because it can use a shapefile directly with the reference 
data. The results provided by this plugin, based on Olofsson 
et al. (2013), include the error matrix and a table with the 
following statistics: the class area, the producer's and user's 
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accuracy values, the adjusted areas and their confidence 
intervals. It also includes the overall accuracy and its 
respective standard error. 

This plugin is a test version and has not yet been accepted 
in the official QGIS repositories. 


Exercise 1. To validate a map against reference 
data/map 


Aim 

To validate a LUC map for the Marqués de Comillas study 
area by computing accuracy assessment statistics and the 
error matrix via cross-tabulation of the reference data and the 
thematic map. 


Materials 


Marqués de Comillas Land Use Cover Map 2019 
Photointerpreted reference dataset—Marqués de Comillas 
2019 (reference dataset resulting from the exercise in Sect. 2 
in Chapter “Sample Data for Thematic Accuracy Assessment
in QGIS”) 


Requisites 


In order to compute the areas, the land cover map must be in 
raster format (GeoTiff) in any cartographic projection. The 
reference data must be contained in a shapefile with the same 
type of projection as the map. The shapefile attribute table 
must contain at least two columns, showing the value for the 
thematic class obtained from the land cover map and the 
value according to field ground-truthing or photointerpreta- 
tion. Both columns must have the same data type (integer or 
text) to be comparable. Each row of the table corresponds to 
one reference site. 


Execution 
Step 1 


Install the MapAccurAssess plugin. Should you need help,
please see Chapter “About This Book” and the plugin's
documentation.


Step 2 


If the plugin has been successfully installed, an icon should
appear in the main graphics panel. To start the exercise, click on
this icon. Alternatively, go to the Plugins menu, select
Accuracy Assessment and then Accuracy Assessment again.


Table 10 Confusion matrix

                      Reference data
           Class      1         2         ...       q         Total
Map data   1          n_11      n_12      ...       n_1q      n_1+
           2          n_21      n_22      ...       n_2q      n_2+
           ...
           q          n_q1      n_q2      ...       n_qq      n_q+
           Total      n_+1      n_+2      ...       n_+q      n


Fig. 19 Exercise 1 Step 3. MapAccurAssess plugin


Step 3


Select the shapefile with the reference samples
(Photointerpreted reference dataset - Marqués de Comillas
2019)² and indicate the column with the reference data and
the column with the values for the thematic classes used in
the map. After that, select the land cover map you want to
assess (Marqués de Comillas Land Use Land Cover Map
2019). If the map is in vector format, indicate the column
containing the thematic class values. Finally, select a folder
where the results will be saved and click “Accept” (Fig. 19).


² The photointerpreted reference dataset for Marqués de Comillas
(RandomSample_Buffer.shp) was obtained from the exercise in Sect. 2
in Chapter “Sample Data for Thematic Accuracy Assessment in QGIS”.
This layer has two columns, "class" and "refer data". The first contains
the values for the thematic classes used in the map and the second
contains the reference data, which were obtained from the photoint-
erpretation of satellite images.


Results and Comments 


The output of this plugin consists of two CSV tables. The 
first contains the error matrix (Table 11), and the second 
contains the map accuracy assessment statistics (Table 12). 
These statistics are as follows: user’s accuracy, producer’s 
accuracy, thematic class area (as retrieved from the map), the 
area adjusted by the error level (Area_adj), the confidence 
intervals for the adjusted area (CI sup and CI inf) and the 
overall accuracy (O). 
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Table 11 Result from Exercise 1. Error matrix 


Classes   130   161   261   290
130       35    0     0     14
161       0     38    0     6
261       0     0     50    0
290       2     9     0     328
301       0     1     0     13
420       0     0     0     1
50        0     0     0     5
51        0     0     0     3


Table 12 Results from Exercise 1, Step 3 presented in the second "output" CSV file (accuracy indices) 


Classes   UsAcc   ProdAcc   Area (ha)
130       0.7     0.21      252
161       0.76    0.63      6943
261       1       1         13,504
290       0.95    0.96      116,429
301       0.72    0.63      2357
420       0.98    0.86      2021
50        0.83    0.80      26,009
51        0.82    0.86      32,875

Overall Accuracy (O): 0.91
Std(O): 0.0113


According to the data from this exercise, the overall 
accuracy of the map is 0.91. In other words, there is a high 
probability (91%) that a randomly selected location on the 
map will be correctly classified. Note that the thematic class 
with the lowest accuracy is 130 (Wetland), with a user's
accuracy of 0.7 and a producer's accuracy of 0.21. This class
covers a small area (252 ha according to the map). We 
decided to keep this class to show that illogical situations 
can occur when there is only a small number of sampling 
sites, e.g. negative areas. However, we recommend merging 
class 130 with another class of similar characteristics and 
recomputing. 
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Abstract 


Several validation techniques based on the cross-tabulation
matrix can be applied to validate Land Use Cover
(LUC) maps. The exercises in this chapter focus, in
particular, on the cross-tabulation techniques proposed by
Robert Gilmore Pontius Jr., who has developed many
indices and techniques in this field. Given his major
contribution to this family of validation techniques, we
have associated his name here with cross-tabulation
techniques without this in any way implying that his
scientific activity is limited to this field. The null model
(Sect. 1) is especially useful for validating simulations,
comparing the modelled map to a reference map with full
persistence. LUCC budget (Sect. 2) only focusses on
changes, which it splits into different components. This
method can be used to compare the changes we want to
validate with a reference set of changes, so providing
interesting information as to how well our maps capture the
dynamics of the landscape. Quantity and allocation
disagreement (Sect. 3) analyse the differences between
the reference map and the map being validated using two
indices: disagreement in quantity and disagreement in
allocation. The Figure of Merit (FoM) (Sect. 4) technique is
used to validate a set of LUC changes by comparing them
with a reference, distinguishing between different compo-
nents of agreement: correctly simulated change, wrongly
simulated or missing change. Incidents and States (Sect. 5)
allows us to identify illogical transitions in a time series of
maps by providing the number of states and transitions that
a cell undergoes over the course of the series. Intensity
analysis (Sect. 6) and Flow matrix (Sect. 7) also enable us
to validate the logic of LUC changes in a time series of
maps. Intensity analysis provides information on the speed
of changes, identifying those transitions or changes that do
not follow a logical trend, while the flow matrix enables us
to spot unstable changes in a series of maps. In this chapter,
we present examples of how these techniques can be used in
different cases: to validate single LUC maps, to validate a
series of maps with two or more time points, to validate
simulated changes against a reference map of changes and
to validate changes simulated by various models. All these
techniques are illustrated by exercises using datasets from
the Asturias Central Area and the Ariège Valley.
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1 Null Model 


Description 


The null model is a method specifically developed by Pon-
tius and Malanson (2005) to validate LUCC modelling
simulations. It assumes that the land use/land cover at the
simulation start time ($t_1$) is exactly the same at the end time
($t_2$) and that no changes take place. The aim is to evaluate
whether a landscape with no changes more closely resembles
the reference landscape for the year of the simulation ($t_2$)
than the simulated landscape. In other words, we change the
date of the initial LUC map while leaving the content
unchanged. It then becomes a reference map (no change)
with which we can measure the predictive power of the
model.

If the agreement between the observed LUC at $t_2$ and the
simulation map at $t_2$ is higher than that between observed
LUC at $t_2$ and the so-called null model, the simulation has
greater predictive power than the hypothesis of complete
persistence (no change). The agreement between the null
model, the simulation and the reference map is usually
assessed using common cross-tabulation techniques and
Kappa indices (see Sect. 1 in Chapter “Basic and Multiple-
Resolution Cross-Tabulation to Validate Land Use Cover
Maps” and Sect. 3 in Chapter “Metrics Based on a Cross-
Tabulation Matrix to Validate Land Use Cover Maps”).


Utility 


Exercises 


1. To validate simulated changes against a reference map of changes 


The null model helps to measure the relative success of a 
simulation compared to persistence in time. The usefulness 
of this method depends on the spatiotemporal dynamics of 
the study area. 

The method is based on the hypothesis that a simulation is 
successful if it gets better validation scores than a landscape in 
which no changes occur. When simulating change in a study 
area in which little change is taking place, it may be difficult to 
correctly simulate these changes in the same positions as on 
the reference map of changes. As a result, the null model may 
provide better validation scores than the simulation, in that the 
null model avoids possible errors when allocating changes 
and always simulates persistence correctly. This is why the 
null model is especially useful for validating whether an 
LUCC model simulates persistence correctly. 


QGIS Exercise 


Available tools 


* Processing Toolbox
    GRASS
        Raster (r.*)
            r.kappa
* Semi-Automatic Classification Plugin
    Tab: Postprocessing
        Section: Cross-classification
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To calculate the null model, we must use the same tech- 
niques as cross-tabulation and Kappa. Please see Sect. 1 in 
Chapter “Basic and Multiple-Resolution Cross-Tabulation to 
Validate Land Use Cover Maps” and Sect. 3 in Chapter
“Metrics Based on a Cross-Tabulation Matrix to Validate
Land Use Cover Maps” for details about how to compute
cross-matrices and kappa indices between two raster layers.


Exercise 1. To validate simulated changes against a 
reference map of changes 


Aim 


To find out if the prediction score obtained by the simulation 
map for 2018 is higher than that obtained by the null model. 


Materials 


CORINE Land Cover Map Val d'Ariège 2012
CORINE Land Cover Map Val d'Ariège 2018
Simulation LCM Val d'Ariège 2018


Requisites 


All maps must be rasters and must have the same resolution, 
extent and projection. 


Execution 
Step 1 


The first step is to calculate the Kappa indices measuring the
agreement between the simulation, the null model and the
reference map showing observed LUC in 2018. We use the
GRASS r.kappa raster tool to calculate the kappa values for
agreement: (i) between observed LUC in 2012 duplicated in
2018 (null model) and observed LUC in 2018 and (ii) between
observed LUC in 2018 and simulated LUC in 2018.


Step 2 


We then generate the cross-matrices between the simulation,
null model and reference map (CLC_2012 against
CLC_2018 and CLC_predict_2018 against CLC_2018)
using the Cross-classification tool (see Exercise 2 of Sect. 1
in Chapter “Basic and Multiple-Resolution Cross-Tabulation
to Validate Land Use Cover Maps”). This method comple-
ments the kappa agreement indices and provides additional
information about the similarity between the different maps.


Step 3 


Once the cross-tabulations are obtained, on a spreadsheet we 
calculate the sum of cells on the diagonal (pixel-to-pixel 
correspondence). 
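
The two scores compared in this exercise can also be checked outside the spreadsheet. The following is a minimal sketch in plain Python, assuming the cross-tabulation matrix has already been exported as a square table of pixel counts. The function names and the two small matrices are hypothetical and serve only as illustration; they are not the Val d'Ariège results, and the kappa shown is the standard Cohen's kappa, which we assume corresponds to the overall value reported by r.kappa.

```python
def overall_agreement(matrix):
    """Proportion of pixels on the diagonal of a square cross-tabulation matrix."""
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    return diagonal / total


def cohen_kappa(matrix):
    """Cohen's kappa: agreement corrected for the agreement expected by chance."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical pixel counts (rows: map being assessed, columns: reference map).
hypothetical_null = [[50, 2, 0], [1, 30, 1], [0, 1, 15]]
hypothetical_simulation = [[49, 3, 0], [2, 29, 1], [0, 2, 14]]

for name, m in (("null model", hypothetical_null),
                ("simulation", hypothetical_simulation)):
    print(name, round(overall_agreement(m), 4), round(cohen_kappa(m), 4))
```

If the simulation scores lower than the null model on both measures, as in this invented example, the interpretation is the same as the one discussed in the results of this exercise.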


Results and Comments 


The resulting Kappa values are 0.9849 for the simulation
(CLC predict 2018 related to CLC 2018) and 0.9875 for
the null model (CLC 2012 related to CLC 2018). The
quantity and allocation correspondence (the proportion of
diagonal pixels in the cross-matrices) is 98.22% for the
simulation and 98.53% for the null model. Therefore, with
both techniques, the null model obtains a slightly higher
score than the simulation.

Interpretation of these results is difficult and has to be
done carefully due to the limitations of this technique and
the criticisms often levelled against it. The results show that
persistence is the dominant process (98.5% of the study area
did not change between 2012 and 2018; null model). Taking
into account that most models simulate persistence better
than change, it would be difficult to obtain a higher
prediction score for a study area in which so little land use
change is taking place. The low proportion of changes
makes it difficult to simulate the changes between land use
categories correctly. The slightest error diminishes the
performance of the simulation compared to the null model.

Other methods, such as the Figure of Merit (see Sect. 4), 
can provide a better picture on how the model correctly 
simulated the change. 


2 LUCC Budget 
Description 


LUCC budget is a technique for analysing land use/cover
change (LUCC) using the cross-tabulation matrix obtained
by overlaying two maps of the same area at two different
dates. For each category, the changes are broken down into
four components: gross gains, gross losses, net change and
swap (Pontius et al. 2004).

Gross gains are the areas gained by each category, and
gross losses are the areas lost. Net change is the difference
between gains and losses. In categories in which gains and
losses occur in different places, swap is a measure of
the real changes taking place which are not revealed by the
net change indicator. It measures the total area in which an
equivalent amount of gains and losses has taken place. For
example, if in one category there are gains of 5 ha in one
place and losses of 3 ha in another, the 3 ha that it loses in
one place and recoups in another are the swap
(swap = 3 + 3 = 6 ha), while the remaining 2 ha (5 - 3) are
the net change.
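
The four components can be derived from any square change matrix with a few lines of code. The sketch below is a plain-Python illustration under our own assumptions: rows hold the categories at the first date, columns hold the categories at the second date, and the function name `lucc_budget` and the two-category matrix are hypothetical. It is not the LUCCBudget.rsx script used in the exercise.

```python
def lucc_budget(matrix):
    """Gross gains, gross losses, net change and swap for each category.

    matrix[i][j] is the area (or pixel count) that was category i at the
    first date and category j at the second date.
    """
    n = len(matrix)
    budget = {}
    for k in range(n):
        persistence = matrix[k][k]
        gains = sum(matrix[i][k] for i in range(n)) - persistence   # column total minus diagonal
        losses = sum(matrix[k]) - persistence                       # row total minus diagonal
        budget[k] = {
            "gains": gains,
            "losses": losses,
            "net": abs(gains - losses),
            "swap": 2 * min(gains, losses),
        }
    return budget


# Hypothetical matrix in ha: category 0 gains 5 ha and loses 3 ha.
example = [[10, 3],
           [5, 20]]
print(lucc_budget(example)[0])
```

For category 0 this reproduces the worked example in the text: gains of 5 ha and losses of 3 ha give a swap of 6 ha and a net change of 2 ha.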


Utility 


Exercises 


1. To validate a series of maps with two or more time points 


When monitoring landscape changes, the LUCC budget
technique helps to identify the most critical land use
transitions and should ultimately facilitate linking patterns to
process (Pontius et al. 2004). It also allows modellers to
compare observed LUCC with simulated LUCC in both the
calibration and validation steps of LUCC simulation models
(Paegelow 2018). In short, LUCC budget enables a more
detailed analysis of land use change in a particular area.


QGIS Exercise 


Available tools 


* Processing R provider plugin 
LUCCBudget.rsx R script 


The components of change computed by the LUCC budget 
are derived from the cross-tabulation matrix. This matrix can 
be obtained by overlaying the two maps in QGIS and then 
calculating the LUCC budget values using a spreadsheet 
programme. However, we suggest using the LUCCBudget.rsx
R script with the QGIS Processing R provider plugin.
This script will carry out the entire LUCC budget calculation
and will generate a table containing the values for the four
components of change.

See Chapter “About this Book” for more detailed information
about how to integrate R into QGIS and how to use R
scripts such as the one applied in this exercise.


Exercise 1. To validate a series of maps with two or 
more time points 


Aim 


To carry out a LUCC budget analysis in the Ariège study area
using the CORINE Land Cover maps dated 2000 and 2018.


Materials


CORINE Land Cover Map Val d'Ariège 2000
CORINE Land Cover Map Val d'Ariège 2018


Requisites 


All maps must be in raster format and have the same reso- 
lution, extent and projection. 


Execution 


If necessary, install the Processing R provider plugin, and
download the LUCCBudget.rsx R script into the R scripts
folder (processing/rscripts). For more details, see Chapter
“About this Book”.


Step 1 


Then, run the script and fill in the required parameters (names 
of the two maps and the output table) as shown in Fig. 1. 


Results and Comments 


The script will generate the cross-tabulation or change matrix 
as shown in Table 1. This matrix is saved as an intermediate 
product. The script will also generate a table in CSV format 
that indicates, for each category, the value of the four com- 
ponents assessed by the LUCC budget technique (Table 2). 


Fig. 1 Exercise 1. Step 1. LUCCBudget R script. The dialog
takes the two input maps (Map date 1: CLC 2000; Map date 2:
CLC 2018) and two CSV outputs (Cross tab: change matrix;
Budget tab: LUCC budget)


Table 1 Result from Exercise 1. Cross-tabulation or change matrix

     0        1        2        3        4        5        6
0    74,437   0        0        0        0
1    0        3,302    3                 8        0        37
2    0        1,853    52,059   235      409      0
3    0        23       109      39,232   127      0
4    0        12       399      877      11,418   22
5    0        0                 0        10       921
6    0        0                 0        0        0        76


Table 2 Results from Exercise 1. LUCC budget components

Category   Gains    Losses
0          0        0
1          1,888    48
2          511      2,499
3          1,112    258
4          554      1,310
5          22       10
6          38       0

As can be seen in Table 2, the only class in which there
are gains but no losses, and consequently no swap, is
Category 6 (water). Therefore, for this category, the gross change is
equal to the net change. Similar behaviour could be expected 
for Category 1 (built-up) because it is a "definitive" class 
(with no return), in the sense that it is very unlikely that a 
built-up area will be converted into another land cover. 
However, the change matrix (Table 1) shows small areas of 
transition from Category 1 (built-up) to Categories 2 (agri- 
culture), 4 (scrublands) and 6 (water). These transitions are 
probably erroneous changes, resulting from misclassifica- 
tions in the maps. The other categories appear to be more 
dynamic with both gross losses and gains and significant 
swap values. 


3 Quantity and Allocation Disagreement 
Description 


Pontius Jr. and Millones (2011) proposed a set of metrics, 
obtained from the cross-tabulation matrix, which classify the 
overall change detected between a pair of maps into various 
components, namely, differences in the quantity of each 
category and differences in their location. 

When analysing a time series (or single maps evaluated 
against a reference map), this method can differentiate 
between the changes that are due to differences in the rela- 
tive importance of certain categories (some increase and 
others decrease) and those derived from changes in the 
location of the elements that make up these categories. It 
also identifies the categories that undergo net changes and 
swaps. As regards differences in location, this method dis- 
tinguishes between exchanges between classes and changes 
in the location of two or more classes. 


Table 2 (continued)

Category   Swap     Net
0          0        0
1          96       1,840
2          1,023    1,987
3          517      854
4          1,108    756
5          20       12
6          0        38


Utility 


Exercises 


1. To validate a series of maps with two or more time points 


Quantity and allocation disagreement assess how similar a
simulation is to a reference map, differentiating
between (dis)agreement that is due to the quantities of
different classes and (dis)agreement caused by the allocation
of these classes in different places. By providing the same
information, this method can also be used to validate an
LUC map against a reference map or to assess the LUC
changes in a time series of maps and understand whether or
not these changes follow a logical trend.


QGIS Exercise 


Available tools 


* Processing Toolbox 
GRASS 
Raster 
r.cross
r.kappa
SAGA
Confusion matrix
* Pontius matrix (Excel sheet)
http://www2.clarku.edu/~rpontius/PontiusMatrix41.xlsx
* Semi-Automatic Classification plugin (SCP) 
Tab: Postprocessing 
Section: Cross-Classification 


For more information about the use of r.cross, r.kappa,
SAGA Confusion matrix and SCP, please refer to Chapters
“Basic and Multiple-Resolution Cross-Tabulation to Validate
Land Use Cover Maps” and “Metrics Based on a
Cross-Tabulation Matrix to Validate Land Use Cover Maps”.
QGIS Raster Calculator is a generic tool performing all
kinds of raster calculations. It is intended for detailed
analysis of the differences in quantity and allocation, rather
than global studies.


Exercise 1. To validate a series of maps with two or 
more time points 


Aim 


To detect quantity and allocation changes in the CORINE
LUC maps of the Ariège Valley (southern France) between
2012 and 2018.


Materials


CORINE Land Cover Map Val d'Ariège 2012
CORINE Land Cover Map Val d'Ariège 2018


Requisites 


All maps must be in raster format with the same resolution, 
extent and spatial reference system (SRS). 


Execution 
Step 1 


In order to be able to make this analysis, the CORINE LUC 
map for 2018 must be polygonized. To this end, use the tool 
Polygonize. 


Step 2 


After polygonizing the CORINE raster, the next stage is to
cross-tabulate the two maps we are going to compare. To
this end, open the SAGA confusion matrix tool and select
the CORINE LUC map for 2012 as the Classification 1 layer
and the CORINE LUC map for 2018 as the Classification 2
layer. Then fill in the Value, Value (Maximum) and Name
parameters. Do not change any default options (the “Report
unchanged classes” box must be ticked and the output must
be set to “cells”) and open the results generated (Fig. 2).
The results do not need to be saved in a file; they can be
handled as temporary layers.


Step 3


Import the SAGA-generated confusion matrix obtained in
the previous stage into spreadsheet software such as Excel.
Then convert the matrix into percentages (Table 3). This is
done by dividing each pixel count in the original table by
the total number of pixels and multiplying the result by 100.


Step 4 


Finally, use the SAGA-generated confusion matrix obtained
in Step 2 to calculate the quantity and allocation
disagreements in spreadsheet software such as Excel. For a
pixel resolution of 15 x 15 m, 1 ha corresponds to 44.44 pixels.
Quantity disagreement is calculated for each category as the
difference between its two marginal totals. With the 2012 map
in rows and the 2018 map in columns, as in Table 3, the
2018-2012 balance is the column total minus the row total
(Table 4). Allocation disagreement corresponds to all
off-diagonal cell values.
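
The spreadsheet operations in Steps 3 and 4 can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the procedure used to produce Tables 3 and 4: the function names and the two-category matrix of pixel counts are hypothetical, rows are taken as the earlier map and columns as the later map, and the overall split into quantity and allocation disagreement follows our reading of Pontius Jr. and Millones (2011).

```python
def to_percent(matrix):
    """Convert pixel counts into percentages of the study area."""
    total = sum(sum(row) for row in matrix)
    return [[100.0 * value / total for value in row] for row in matrix]


def net_change_ha(matrix, pixel_size_m=15.0):
    """Later-minus-earlier balance per category, in hectares."""
    pixels_per_ha = 10_000.0 / pixel_size_m ** 2      # 44.44 pixels for 15 x 15 m
    n = len(matrix)
    row_totals = [sum(row) for row in matrix]                          # earlier map
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # later map
    return [(col_totals[k] - row_totals[k]) / pixels_per_ha for k in range(n)]


def quantity_allocation(matrix):
    """Overall quantity and allocation disagreement as proportions of the area."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    disagreement = 1.0 - sum(matrix[i][i] for i in range(n)) / total
    quantity = sum(abs(c - r) for r, c in zip(row_totals, col_totals)) / (2.0 * total)
    return quantity, disagreement - quantity


# Hypothetical pixel counts for two categories.
counts = [[400, 50],
          [10, 540]]
print(to_percent(counts))
print(net_change_ha(counts))
print(quantity_allocation(counts))
```

In this invented case category 0 loses 40 pixels net (0.9 ha), while the total disagreement of 6% splits into 4% quantity disagreement and 2% allocation disagreement.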


Results and Comments 


Table 3 shows the SAGA-generated confusion matrix 
reformatted in Excel and converted into a per cent of the 
study area. The sum of the diagonal corresponds to the 
overall persistence between 2012 and 2018. This value is 
98.52%, which means that the change rate is 1.48%. 

Although the net balance values (2018-2012) provided in
Table 4 mask the changes that have taken place in certain
classes, we can see from Table 3 that built-up gains (1.01%)
result almost exclusively from the conversion of agricultural
and pasture land (1.00%), whose losses are partially
compensated by the conversion of scrubland into agriculture
and pasture (0.08%). Scrubland is the only category with
losses and no gains.

Table 4 expresses the amount of change (2018-2012) in 
ha (for a pixel resolution of 15 x 15 m; 1 ha corresponds to 
44.44 pixels). As can be seen, no significant changes took 
place in mineral and water areas, while losses in scrubland 
were matched by gains in forest (about 400 ha) and losses in 
agriculture and pasture were matched by gains in built-up 
areas (about 1,000 ha). 

Allocation disagreement corresponds to all off-diagonal
cell values. These may be expressed as gains (2018 minus the
intersection of 2012 and 2018) and losses (2012 minus the
intersection of 2012 and 2018). While in some classes there
are only net changes (e.g. scrubland, the only category with
losses and no gains), the changes in agriculture and pasture
land are almost all losses (1.04%), with just a few small gains
(0.08%) from scrubland. This means that quantity
disagreement shows a negative net balance for agriculture and
Table 3 Result from Exercise 1. Confusion matrix between 2018 and 2012 maps

Rows: 2012; columns: 2018; values in %

              Built-up   Agriculture   Forest
Built-up      3.63       0.00          0.00
Agriculture   1.00       47.21         0.04
Forest        0.01       0.03          36.03
Scrubs        0.00       0.08          0.28
Mineral       0.00       0.00          0.00
Water         0.00       0.00          0.00
Total 2018    4.63       47.32         36.36
Gains         1.01       0.11          0.33

Table 4 Result from Exercise 1. Net change (ha) per category

Category      Quantity disagreement (ha) 2018-2012
Built-up      1,083.04
Agriculture   -1,037.43
Forest        322.58
Scrubs        -406.55
Mineral       4.30
Water         33.59


pasture of about 1,037 ha (see Table 4), while allocation 
disagreement shows that more agriculture and pasture land is 
affected with losses of about 1,160 ha (1.04% converted into 
ha) and gains of about 123 ha between 2012 and 2018. 
Unlike allocation disagreement, quantity disagreement hides 
the real amount of land in which changes take place (for 
more details, see Sect. 2). 


4 Figure of Merit (FoM) and Complementary 
Producer's and User's Accuracy 


Description 


The Figure of Merit (Pontius et al. 2008) is a measure that
examines how simulated change overlaps with a reference
map of changes. A Figure of Merit of 0% means there is no
overlap, whereas a Figure of Merit of 100% means perfect
overlap. The overlap between real changes and simulated
changes leads to four possible combinations. These are the
four components of the Figure of Merit:


* MISSES (A) = the real maps show change but the simulation
shows persistence.

* HITS (B) = the real maps show change and the simulation
shows change.


Table 3 (continued)

              Scrubs   Mineral   Water   Total 2012   Losses
Built-up      0.00     0.00      0.03    3.66         0.03
Agriculture   0.00     0.00      0.00    48.25        1.04
Forest        0.00     0.00      0.00    36.07        0.04
Scrubs        10.74    0.00      0.00    11.11        0.37
Mineral       0.00     0.85      0.00    0.85         0
Water         0.00     0.00      0.06    0.06         0
Total 2018    10.74    0.86      0.09    100
Gains         0.00     0.00      0.03


* WRONG HITS (C) = the real maps show change and the
simulation shows change but allocates it to the wrong
category.

* FALSE ALARMS (D) = the real maps show persistence
but the simulation shows change.


The Figure of Merit is calculated via the following ratio of 
the four components: B/(A + B + C + D). 

The overlap between real changes and simulated changes 
also produces a fifth combination: 


* CORRECT REJECTIONS (E) = the real maps show
persistence and the simulation shows persistence.


Two complementary measures can be obtained using the 
same components of the Figure of Merit: 


* Producer's accuracy: A measure calculated using the ratio
B/(A + B + C), which expresses “the proportion of pixels
that the model predicts accurately as change, given that
the reference maps indicate observed change” (Pontius
et al. 2008).

* User's accuracy: A measure calculated using the ratio
B/(B + C + D), which measures the number of pixels that
the model predicts accurately as change as a proportion of
all the changes it predicts.
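
The three ratios defined above translate directly into code. The following plain-Python sketch uses hypothetical function names and invented pixel counts; it simply encodes the formulas B/(A + B + C + D), B/(A + B + C) and B/(B + C + D) and does not guard against a zero denominator.

```python
def figure_of_merit(misses, hits, wrong_hits, false_alarms):
    """B / (A + B + C + D)."""
    return hits / (misses + hits + wrong_hits + false_alarms)


def producers_accuracy(misses, hits, wrong_hits):
    """B / (A + B + C): hits as a share of all observed change."""
    return hits / (misses + hits + wrong_hits)


def users_accuracy(hits, wrong_hits, false_alarms):
    """B / (B + C + D): hits as a share of all simulated change."""
    return hits / (hits + wrong_hits + false_alarms)


# Invented counts: A = 60 misses, B = 20 hits, C = 5 wrong hits, D = 15 false alarms.
A, B, C, D = 60, 20, 5, 15
print(figure_of_merit(A, B, C, D))
print(producers_accuracy(A, B, C))
print(users_accuracy(B, C, D))
```

Correct rejections (E) do not enter any of the three ratios, which is why a landscape dominated by persistence does not inflate these scores.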


Utility 


Exercises 


1. To validate simulated changes against a reference map of changes 
2. To validate simulated changes against a reference map of changes 
in a binary format 

3. To validate the changes simulated by various models 


The Figure of Merit and the complementary Producer's and
User's accuracies are very useful measures for validating the
change simulated by a model. The different components of
the Figure of Merit can give users a better picture of how
accurate the simulation is, e.g. whether the model estimated
more or fewer changes than those appearing on the reference
map. They can also differentiate between quantity and
allocation errors (Pontius et al. 2018).

These measures are also highly recommended for com- 
paring several simulations using a standard measure. They 
can be applied, for example, to assess the congruence of 
model outputs. This is a form of validation that evaluates the 
agreement between simulations obtained through different 
models or between simulations obtained using the same 
model but parametrized in different ways. The agreement 
between the simulation maps is measured and the degree of 
congruence is considered an indicator of the stability of the 
model and the plausibility of the simulations. The congru- 
ence of model outputs provides useful information about 
model robustness (Paegelow et al. 2014; Camacho Olmedo 
et al. 2015). 

Complementary analyses to the Figure of Merit and the 
Producer's and User's accuracies include spatial metrics, 
Kappa indices, the Land Use and Cover budget (LUCC 
budget) technique and Quantity and Allocation disagreement. 
These indices are described in Sects. 2 and 3 of this chapter. 


QGIS Exercises 


Available tools 


* Processing Toolbox 
SAGA 
Image analysis 
Confusion matrix (two grids) 
Confusion matrix (polygons/grid) 
Raster analysis 
Cross-classification and tabulation 
* Processing Toolbox 
GRASS 
Raster 
r.cross 
* Semi-Automatic Classification Plugin 
Tab: Postprocessing 
Section: Cross-classification 
Section: Accuracy 
Section: Land cover change 


The Figure of Merit and the complementary Producer's and 
User's accuracy indices are not calculated directly in QGIS. 
Producer's and User's accuracy per category can be calcu- 
lated using the SAGA Confusion matrix (two grids) and 
Confusion matrix (polygons/grid) tools and in the 
"Semi-Automatic Classification Plugin" (Accuracy). 

Users can calculate the Figure of Merit from the
cross-tabulation matrices. As commented in Sect. 1 in
Chapter “Basic and Multiple-Resolution Cross-Tabulation to
Validate Land Use Cover Maps”, QGIS includes many tools
for cross-tabulating spatial data in the GRASS and SAGA
toolboxes. The “Semi-Automatic Classification Plugin” also
includes cross-tabulation tools.

Of all the tools available in QGIS, in this book we
recommend the “Semi-Automatic Classification Plugin”, which
is the most efficient and most stable tool of all those assessed.


Exercise 1. To validate simulated changes against a 
reference map of changes 


Aim 


To validate the change simulated by a model against a ref- 
erence map of changes for the same simulation period. The 
initial map is the CORINE map for 2005 in both cases. The 
changes from 2005 to 2011 are calculated for the simulation 
and for the CORINE data as reference. 


Materials 


CORINE Land Use Map Asturias Central Area 2005 
CORINE Land Use Map Asturias Central Area 2011 
Simulation CORINE Asturias Central Area 2011


Requisites 


The maps must have the same extent, spatial resolution, 
projection and legend. If they do not have the same legend, 
the maps must be reclassified to meet this requirement. For a 
proper validation, the latest reference map must refer to the 
same date as the simulation. 


Execution 
Step 1 


We begin by obtaining two rasters showing the areas that 
changed in the study area during the period analysed and 
those that remained the same. This procedure must be done 
twice: once for the reference map (CORINE 2005-CORINE 
2011) and once for the simulated map (CORINE 2005- 
Simulation 2011). 

To obtain these maps, open the “Semi-Automatic Classification
Plugin” and the “Postprocessing” tab. Then select
Land cover change and fill in the required parameters: the
earlier map in the reference classification (CORINE 2005)
and the more recent map in the new classification (CORINE
2011; Simulation 2011) (Fig. 3). Leave the “Report
unchanged pixels” option unmarked so as to obtain a map
that only shows the areas that changed during the study
period. If this option is marked, a map showing both change
and persistence areas will be obtained.


Fig. 3 Exercise 1. Step 1. Semi-Automatic Classification Plugin

Run the tool to obtain two output maps showing the 
changes on the reference map (CORINE) and the changes 
simulated by the model. Both will refer to the same period 
(2005-2011). 
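
The logic of a change map can be illustrated with a short plain-Python sketch. It assumes that each raster is a small grid of category codes held as a list of rows. The function name `change_map`, the sequential numbering of the transition codes and the use of 0 for persistence are our own hypothetical choices; the plugin assigns its own codes, which are listed in the table it produces.

```python
def change_map(earlier, later, report_unchanged=False):
    """Code each pixel by its (earlier, later) category pair.

    Returns the coded grid and a legend {(from, to): code}. Unchanged
    pixels are coded 0 unless report_unchanged is True.
    """
    legend = {}
    coded = []
    for row_earlier, row_later in zip(earlier, later):
        coded_row = []
        for a, b in zip(row_earlier, row_later):
            if a == b and not report_unchanged:
                coded_row.append(0)
                continue
            # Assign the next free code the first time a transition appears.
            coded_row.append(legend.setdefault((a, b), len(legend) + 1))
        coded.append(coded_row)
    return coded, legend


# Tiny hypothetical rasters (category codes only).
map_2005 = [[0, 0, 1],
            [1, 2, 2]]
map_2011 = [[0, 2, 1],
            [3, 2, 0]]
coded, legend = change_map(map_2005, map_2011)
print(coded)
print(legend)
```

Running the same function once with the reference map and once with the simulation as the later map gives the two change maps that are cross-tabulated in the next step.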


Step 2 


The next stage involves cross-tabulating the two maps of
changes. To do so, open the Semi-Automatic Classification
Plugin and in the “Postprocessing” tab, select
Accuracy. Select the required parameters: classification to
assess (simulated changes) and reference raster (CORINE
05-11 changes) (Fig. 4).


Results and Comments 


Step 1 produces two maps of changes, which are stored in
the folder specified by the user. The function also generates a 
matrix for each pair of cross-tabulated maps. These matrices 
appear in the “output” window, stored in CSV format. They 
show each possible combination between the two 
cross-tabulated maps and the code under which each com- 
bination is represented in the output raster. 


Only four transitions (new codes 3, 4, 16 and 17) are
simulated by the model, as expressed in Table 5. Twenty-eight
transitions occur between the CORINE maps (Table 6).

Most of the changes predicted in the simulation refer to 
the transition from agricultural areas (Category 0) to urban 
fabric (Category 2) and to the transition from agricultural 
areas to industrial and commercial areas (Category 3). 
Together, they represent 1,546 of the 1,632 pixels simulated. 
That is, almost 95% of the simulated pixels. In the reference 
map, these transitions represent 751 and 503 pixels, 
respectively, a less significant proportion of total change (in 
italics in Table 6). 

After completing Step 2, we now have a cross-tabulation 
raster and a table showing every possible combination 
between the two cross-tabulated maps (Table 7). 

Following the definitions provided by Pontius et al. 
(2008), in our case, HITS were only obtained in new codes 
12 (old code 3 in the CORINE map of changes and old code 
3 in the simulated map of changes), 18 (old codes 4 and 4) 
and 55 (old codes 17 and 17). HITS are obtained when both 
the reference map and the simulation show the same change 
or transition, which is why they both have the same codes. 

The WRONG HITS correspond to combinations where 
both the reference map and the simulation show change, but 
to different gaining categories. For example, new code 13 
(old codes 3 and 4) refers to areas that were agricultural 
Fig. 4 Exercise 1. Step 2. Semi-Automatic Classification Plugin


Table 5 Result from Exercise 1. Variety and size of the simulated
transitions

New codes   CORINE 05 category   Simulation category   Pixel sum
3           0                    2                     874
4           0                    3                     672
16          1                    2                     38
17          1                    3                     48


Table 6 Result from Exercise 1. Size of transitions between 
CORINE 2005 and CORINE 2011 maps 


New codes CORINE 05 CORINE 11 Pixel sum 
category category 

2 0 1 374 
3 0 2 751 
4 0 3 503 
5 0 4 148 
6 0 5 11 
7 0 6 301 
10 0 9 132 
14 1 0 588 
16 1 2 61 
17 1 3 82 
18 1 4 157 
19 1 5 109 
20 1 6 225 
24 1 10 180 
27 2 0 21 


(continued) 




Table 6 (continued) 


New codes CORINE 05 CORINE 11 Pixel sum 
category category 
28 2 1 22 
30 2 3 26 
36 2 9 4 
40 3 0 51 
42 3 2 11 
53 4 0 211 
54 4 1 327 
55 4 2 89 
56 4 3 21 
79 6 0 44 
80 6 1 111 
147 11 3 88 
151 11 7 657 


areas that changed to urban fabric in the simulation and to 
industrial and commercial areas in the reference map (Tables 
5 and 6). 

FALSE ALARMS refer to areas that are marked as per- 
sistence in the reference map and as change in the simula- 
tion. Examples include new code 2 (old codes 0 and 3). 
Areas with that code refer to pixels that were simulated as 
urban fabric in the simulation, but do not show change in the 
reference map. Code 0 does not appear among the codes in 
Table 6 summarizing all the possible transitions between the 
original (CORINE 2005) and the reference map (CORINE 
2011). It must therefore refer to persistence. 
Table 7 Result from Exercise 1. (Dis)agreement between the simulated changes and the changes in the reference maps classified in
five categories: misses, hits, wrong hits, false alarms and correct rejections

Interpretation
CORRECT REJECTION
FALSE ALARMS
FALSE ALARMS
FALSE ALARMS
MISSES
MISSES
HITS
WRONG HITS
MISSES
WRONG HITS
HITS
MISSES
MISSES
MISSES
WRONG HITS
WRONG HITS
MISSES
WRONG HITS
MISSES
WRONG HITS
MISSES
MISSES
HITS
MISSES
MISSES
MISSES
MISSES
MISSES
MISSES
MISSES
MISSES
MISSES
MISSES
MISSES
MISSES
MISSES

MISSES refer to the areas where the reference map shows
change but the simulation shows persistence. Examples
include code 16 (old codes 4 and 0). Finally,
CORRECT REJECTION refers to the pixels marked as
persistence in the reference map that were correctly
simulated as persistence (new code 1, old codes 0 and 0).

In total, HITS account for 347 pixels, WRONG HITS for 
89 pixels, FALSE ALARMS for 1,196 pixels and MISSES 
for 4,869 pixels (Table 7). Therefore, the simulation pro- 
duced a lot more FALSE ALARMS than HITS and the vast 
majority of the predictions were MISSES. This makes sense 
because most of the landscape remained unchanged over the 
simulation period. 

With all the above information, we can finally calculate 
the Figure of Merit (B/(A + B + C + D)) for the model. It is 
5.340%. This is a very low Figure of Merit, far below the 
100% that would mean perfect overlap. However, perfect 
overlap is almost impossible. In most cases, low Figures of 
Merit are the norm. 

We must also consider that the Figure of Merit compares 
the simulated changes with all the changes in the reference 
map. In our simulation, we only modelled two categories 
actively (urban fabric and industrial and commercial areas). 
This means that the changes in all the other categories were 
not even simulated and no agreement can therefore be 
expected. This limitation must be borne in mind when 
evaluating the Figure of Merit. 

The best way to obtain a Figure of Merit that offers 
objective information about the validity of our modelling 
exercise is to repeat the same exercise, focusing exclusively 
on the actively modelled transitions (from agricultural and 
vegetation areas to urban fabric and industrial and com- 
mercial areas). 

Producer's accuracy (B/(A + B + C)) is 6.54% and
expresses the number of pixels that the model accurately
predicts as change as a proportion of total observed change.
For its part, User's accuracy (B/(B + C + D)) measures the
number of pixels that the model predicts accurately as
change as a proportion of total predicted change, in this case
21.26%.
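
As a check, the three values can be reproduced from the pixel counts reported above (A = 4,869 MISSES, B = 347 HITS, C = 89 WRONG HITS, D = 1,196 FALSE ALARMS):

```latex
\begin{align*}
\text{Figure of Merit} &= \frac{B}{A+B+C+D}
  = \frac{347}{4869 + 347 + 89 + 1196} = \frac{347}{6501} \approx 5.34\% \\
\text{Producer's accuracy} &= \frac{B}{A+B+C}
  = \frac{347}{4869 + 347 + 89} = \frac{347}{5305} \approx 6.54\% \\
\text{User's accuracy} &= \frac{B}{B+C+D}
  = \frac{347}{347 + 89 + 1196} = \frac{347}{1632} \approx 21.26\%
\end{align*}
```

The denominators 5,305 and 1,632 are, respectively, the total observed change and the total simulated change in pixels.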

As regards the four simulated changes, shown in Table 5, 
the Producer’s and User’s accuracy values for Categories 3 
and 4 are higher than for Category 17, and are zero in 
Category 16 (Table 8). 


Table 8 Results from Exercise 
Exercise 2. To validate simulated changes against a
reference map of changes in a binary format 


Aim 


To validate the change simulated by a model against a ref- 
erence map of changes for the same simulation period. To do 
this, we overlay two maps that show change versus 
non-change over the same period. The initial map in both 
cases is the CORINE dataset for 2005. The changes from 
2005 to 2011 are calculated for the simulation and for the 
CORINE dataset as reference. In this exercise we do not 
evaluate the WRONG HITS. 


Materials 


CORINE Land Use Map Asturias Central Area 2005 
CORINE Land Use Map Asturias Central Area 2011 
Simulation CORINE Asturias Central Area 2011 


Requisites 


The maps must have the same extent, spatial resolution, 
projection and legend. If they do not have the same legend, 
the maps must be reclassified so as to meet this requirement. 
For a proper validation, the latest reference map must refer to 
the same date as the simulation. 


Execution 
Step 1 


The first step is to obtain two rasters showing the areas that 
changed and those that remained the same over the period 
being analysed: one for the reference map (CORINE 2005- 
CORINE 2011) and one for the simulation (CORINE 2005- 
Simulation 2011). To obtain these maps, follow the 
instructions in Exercise 1 Step 1 above. 


Step 2 


Once the two maps have been obtained, they must be
reclassified into binary format, i.e. into a map with two
possible values: 0 (persistence) and 1 (changes). This is done
using the Reclassify by table tool.
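
The reclassification rule itself is very simple and can be sketched in plain Python. We assume here that persistence is stored as code 0 in the change map and that every other code represents a transition; the function name `to_binary` and the small grid are hypothetical.

```python
def to_binary(change_raster, persistence_code=0):
    """Reclassify a coded change map into 0 (persistence) and 1 (change)."""
    return [[0 if value == persistence_code else 1 for value in row]
            for row in change_raster]


# Hypothetical change map: 0 = persistence, other values = transition codes.
coded_changes = [[0, 3, 0],
                 [16, 0, 4]]
print(to_binary(coded_changes))
```

In the Reclassify by table tool the same rule is expressed as a table of value ranges, with the persistence code mapped to 0 and the range covering all the transition codes mapped to 1.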


Table 8 Results from Exercise 1. Producer's and User's accuracy values

Categories in changes (simulation)   3         4         17        16
Producer's accuracy %                27.1638   27.2366   7.3171    0.000
User's accuracy %                    23.3410   20.3869   12.5000   -



Fig. 5 Exercise 2. Step 2. Intermediate map showing the areas of 
change in the reference maps 


Figures 5 and 6 show the change areas (value 1) in black 
and the persistence areas (value 0) in white, for both the 
reference map (Fig. 5) and the simulation (Fig. 6). 


Step 3 


Finally, the two binary maps must be cross-tabulated. To do
so, open the “Semi-Automatic Classification Plugin” and, in
the “Postprocessing” tab, select the Cross-classification
option. Fill in the required parameters: classification (binary
changes from the simulation) and reference raster (binary
changes from CORINE) (Fig. 7).


Results and Comments 


Once we have completed Step 3, QGIS creates an output
raster that shows all possible combinations between the two
binary change maps. The function also generates a table
showing all possible combinations between the two input
maps. This table appears in the “output” window, stored in
CSV format. This table also lists the codes with which each
combination is represented in the output raster.


Fig. 6 Exercise 2. Step 2. Intermediate map showing the areas of 
change in the simulation 


Table 9 presents the four possible combinations obtained 
from the two binary maps crossed in Step 3. As 0 was used 
to represent persistent areas and 1 areas that changed, new 
code 1 (0/0) refers to pixels that the model correctly simu- 
lated as persistence (CORRECT REJECTIONS). New code 
4 (1/1) refers to pixels that the model correctly simulated as 
change (HITS), while codes 2 and 3 refer to pixels in which 
the model does not agree with the reference map. Code 2 
(0/1) corresponds to FALSE ALARMS: the model simulated 
change but the reference map shows persistence. Code 3 
(1/0) stands for MISSES: the model simulated persistence 
but the reference map shows change. 

The sum of MISSES plus HITS (5,305 pixels) represents
the change in the reference map (CORINE) for the period
2005-2011. These pixels cover just 0.9077% of the total
study area. Very little change therefore took place in the
reference map for our study area.

HITS plus FALSE ALARMS (1,632 pixels) gives all the
pixels in which the simulation predicted change. These
pixels cover 0.2792% of the total study area. This means that
fewer changes were simulated than actually took place on
the reference map. This makes sense given that in our
simulation we only simulated the transitions from agricultural
and vegetation areas to urban fabric and industrial and
commercial areas, while the reference map also considered
many other changes between all the other categories
represented on the map, which were not simulated in our
modelling exercise.


Fig. 7 Exercise 2. Step 3. Semi-Automatic Classification Plugin


Table 9 Result from Exercise 1. (Dis)agreement between the simulated changes and the changes in the reference maps classified in five categories: misses, hits, wrong hits, false alarms and correct rejections

New codes  Binary CORINE changes  Binary simulated changes  Pixel sum    Interpretation
1          0                      0                         577,949 (a)  CORRECT REJECTIONS
2          0                      1                         1,196        FALSE ALARMS
3          1                      0                         4,869        MISSES
4          1                      1                         436          HITS

The Figure of Merit (B/(A + B + C + D)) for our simu- 
lation is very low at 6.7%. This indicates that the simulation 
did not simulate most of the changes that took place in the 
reference map correctly. This is partly due to the fact that we 
only actively modelled two categories, while the reference 
map showed the changes that took place between all cate- 
gories. As a result, overlap between the two maps is 
impossible in many areas. Even so, the general level of 
overlap between the simulated changes and those observed 
on the reference maps is still quite low. Other metrics and 
tools must therefore be used in order to interpret the simu- 
lation and the performance of the modelling exercise better. 

The Figure of Merit in this exercise is a bit better than in
the previous one because we did not take WRONG HITS
into account. In this case, we only compared changes,
without taking into account the type of change that happened
in the simulation period.
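
As a cross-check, the figures quoted in these comments can be recomputed from the pixel sums in Table 9. The following is a minimal Python sketch and not part of the QGIS workflow. It assumes that A, B, C and D in the Figure of Merit expression stand for misses, hits, wrong hits and false alarms respectively, and it takes the correct-rejection count from the footnote to Table 9 (917,052 minus 339,103 no-data pixels). The function and variable names are illustrative.

```python
# Pixel sums from Table 9 (Exercise 2). The correct-rejection count is the
# value given in the table footnote, i.e. with no-data pixels removed.
counts = {
    "correct_rejections": 577_949,  # reference 0, simulation 0
    "false_alarms": 1_196,          # reference 0, simulation 1
    "misses": 4_869,                # reference 1, simulation 0
    "hits": 436,                    # reference 1, simulation 1
}


def figure_of_merit(misses, hits, false_alarms, wrong_hits=0):
    """Hits divided by the union of observed and simulated change.

    Assumed mapping: A = misses, B = hits, C = wrong hits, D = false alarms.
    """
    return hits / (misses + hits + wrong_hits + false_alarms)


total = sum(counts.values())
observed_change = counts["misses"] + counts["hits"]
simulated_change = counts["hits"] + counts["false_alarms"]
fom = figure_of_merit(counts["misses"], counts["hits"], counts["false_alarms"])

print(f"Observed change:  {observed_change} px ({100 * observed_change / total:.4f}% of the area)")
print(f"Simulated change: {simulated_change} px ({100 * simulated_change / total:.4f}% of the area)")
print(f"Figure of Merit:  {100 * fom:.1f}%")
```

WRONG HITS default to zero here because this exercise compares change versus persistence only, without looking at the type of change.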


Exercise 3. To validate the changes simulated by 
various models 


Aim 


To compare and validate the change simulated by two 
models. For this purpose, we overlay three maps that show 
change versus non-change over the same interval. The initial 
map in all cases is the CORINE dataset for 2005. The 
changes from 2005 to 2011 are calculated for the simulation 
from model 1, for the simulation from model 2 and for the 
CORINE dataset as reference. WRONG HITS are not 
evaluated in this exercise. 


(a) Note to Table 9: there are 339,103 pixels of no data. If we
subtract them from the 917,052 pixels coded as 1, the result is
577,949 pixels in which there were CORRECT REJECTIONS.


Materials


CORINE Land Use Map Asturias Central Area 2005 
CORINE Land Use Map Asturias Central Area 2011 
Simulation CORINE Asturias Central Area 2011 
Simulation CORINE 2 Asturias Central Area 2011 


Requisites 


The maps must have the same extent, spatial resolution, 
projection and legend. If they do not have the same legend, 
the maps must be reclassified so as to meet this requirement. 
For a proper validation, the latest reference map must refer to 
the same date as the simulation. 


Execution 
Step 1 


The first step is to obtain three rasters for the study area 
showing the areas that changed and those that remained the 
same over the period being analysed. In this way, we obtain: 
(i) the map of changes for the reference map (CORINE 
2005-CORINE 2011), (ii) the map of changes for the first 
simulation (CORINE 2005-Simulation 1 2011) and (iii) the 
map of changes for the second simulation (CORINE 2005- 
Simulation 2 2011). 

To obtain these maps, open the “Semi-Automatic
Classification Plugin” and, in the “Postprocessing” tab, select
Land cover change. Then, fill in the required parameters: the
earliest map in the reference classification (CORINE 2005)
and the more recent maps in the new classifications
(CORINE 2011, Simulation 1 2011, Simulation 2 2011).
The three output maps will show the change areas and
the persistence areas for each of the three maps (the
reference CORINE map and the two simulations) under
consideration.


Step 2 


Once these three maps have been obtained, they must be
reclassified into binary maps in which persistence areas are
reclassified as 0 and change areas as 1. The maps are
reclassified using the Reclassify by table tool.
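
Steps 1 and 2 amount to comparing each pixel at the two dates and coding the result as 0 or 1. The idea can be sketched in plain Python, assuming small nested lists in place of rasters and hypothetical category codes. The helper name binary_change and its no-data handling are illustrative and are not part of the Semi-Automatic Classification Plugin.

```python
def binary_change(earlier, later, nodata=None):
    """Return a grid with 1 where the category changed and 0 where it persisted.

    Cells that hold the no-data value in either input are kept as no data.
    """
    result = []
    for row_earlier, row_later in zip(earlier, later):
        out_row = []
        for before, after in zip(row_earlier, row_later):
            if before == nodata or after == nodata:
                out_row.append(nodata)
            else:
                out_row.append(1 if before != after else 0)
        result.append(out_row)
    return result


# Hypothetical 2 x 3 maps; the numbers are arbitrary category codes.
corine_2005 = [[1, 1, 2],
               [3, 3, 2]]
corine_2011 = [[1, 4, 2],
               [3, 4, 2]]

changes_c05_c11 = binary_change(corine_2005, corine_2011)
print(changes_c05_c11)
```

The same function would be applied three times in this exercise: once for the reference pair and once for each simulation, always with CORINE 2005 as the earlier map.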


Step 3 


The three binary maps must then be cross-tabulated, so as to 
be able to assess the congruence between the simulations 
and the reference map. 

To do this, open the “Semi-Automatic Classification
Plugin” and the “Postprocessing” tab, and then select
Cross-classification. Start by cross-tabulating the two
simulations you want to compare. To this end, fill in the
following parameters: classification (binary map of changes
from simulation 1) and reference raster (binary map of
changes from simulation 2) (Fig. 8).


Fig. 8 Exercise 3. Step 3. Semi-Automatic Classification Plugin


Step 4


The procedure is repeated, this time cross-tabulating
the raster obtained in the previous step with the reference
map. In this case, open the tool and fill in the parameters as
follows: classification (raster obtained after running the tool
as explained in the previous step) and reference raster
(CORINE 05-11 binary map of changes) (Fig. 9).


Fig. 9 Exercise 3. Step 4. Semi-Automatic Classification Plugin


Results and Comments 


After carrying out Steps 3 and 4, QGIS creates two output 
rasters. The function also generates a table for each raster, 
which appears in the “output” window in CSV format. This 
table shows every possible combination between the values 
of the cross-tabulated maps. It also lists the codes under 
which each combination is represented in the output raster. 

The raster obtained in Step 3 measures the agreement
between the two simulations (Table 10). In the binary maps,
0 was used to refer to persistent areas whereas 1 referred to
areas that changed. New code 1 (previous codes 0/0)
therefore refers to the pixels in which both models predicted
persistence, while new code 4 (1/1) refers to the pixels where
both models predicted change. Finally, new codes 2 and 3
represent areas in which the simulations do not agree: one
shows persistence, whereas the other shows change.


Table 10 Results from Exercise 3. (Dis)agreement between the changes in the two simulations that have been compared

New codes  Binary changes from simulation 1  Binary changes from simulation 2  Pixel sum    Interpretation
1          0                                 0                                 581,158 (a)  Both models predicted persistence
2          0                                 1                                 64           First model predicted persistence/Second model predicted change
3          1                                 0                                 1,660        First model predicted change/Second model predicted persistence
4          1                                 1                                 1,568        Both models predicted change

(a) There are 339,103 pixels of no data. If we subtract them from the
920,261 pixels coded as 1, the result is 581,158 pixels.


The raster obtained in Step 4 was produced by
cross-tabulating a reference change map with the raster
obtained after cross-tabulating the change maps produced by
the two simulations. This cross-tabulation therefore produces
eight possible combinations (Table 11).

In order to interpret the results of this second
cross-tabulation correctly, we need to understand the values
of the two rasters that were cross-tabulated. In the reference
change map, 0 refers to persistent areas and 1 to areas that
changed during the period under consideration. The
meanings of the new codes in the raster obtained in Step 3 are
detailed in Table 10.

This enables a better interpretation of the results of the
last raster generated. New code 1 (previous codes 0/1) refers
to areas in which persistence was observed on the reference
map of changes (code 0) and was also simulated by the two
models (code 1) (see Table 10 to understand the meaning of
this code). Those cases in which the two models and the
reference map all showed persistence are referred to as
DOUBLE CORRECT REJECTIONS (Camacho Olmedo et al. 2015).

New code 4 (previous codes 0/4) refers to areas where the
two models simulated change (code 4) and the reference
change map showed persistence. These are known as
DOUBLE FALSE ALARMS.

New code 5 (1/1) corresponds to areas where both models
simulated persistence and the reference map showed change
(DOUBLE MISSES). New code 8 (1/4) refers to areas where
the two models and the reference map all showed change
(DOUBLE HITS). Finally, the other four combinations refer
to areas where each simulation shows a different agreement
with the reference map (Table 11).


Table 11 Results from Exercise 3. (Dis)agreement between the changes in the two simulations and the changes in the reference maps

New codes  Binary changes CORINE  Cross-tabulation from binary changes simulation from models 1 and 2  Pixel sum    Interpretation
1          0                      1                                                                     576,588 (a)  DOUBLE CORRECT REJECTION: Both models correctly predicted persistence
2          0                      2                                                                     54           CORRECT REJECTION/FALSE ALARMS: First model correctly predicted persistence/Second model wrongly predicted change
3          0                      3                                                                     1,361        FALSE ALARMS/CORRECT REJECTION: First model wrongly predicted change/Second model correctly predicted persistence
4          0                      4                                                                     1,142        DOUBLE FALSE ALARMS: Both models wrongly predicted change
5          1                      1                                                                     4,570        DOUBLE MISSES: Both models wrongly predicted persistence
6          1                      2                                                                     10           MISSES/HITS: First model wrongly predicted persistence/Second model correctly predicted change
7          1                      3                                                                     299          HITS/MISSES: First model correctly predicted change/Second model wrongly predicted persistence
8          1                      4                                                                     426          DOUBLE HITS: Both models correctly predicted change

(a) There are 339,103 pixels of no data. If we subtract them from the
915,691 pixels coded as 1, the result is 576,588 pixels classified as
DOUBLE CORRECT REJECTIONS.


These eight possible combinations are expressed as two
maps. The first map (a zoomed area is shown in Fig. 10, on
the left) shows the four possible combinations for the areas
on the CORINE map in which persistence was observed.
Pixels simulated as persistence are therefore
CORRECT REJECTIONS, while those simulated as change
areas are FALSE ALARMS. The areas that changed are
masked in white. The second map (a zoomed area is shown
in Fig. 11, on the right) shows the four possible
combinations for the areas on the CORINE map in which
change was observed. Pixels simulated as change are HITS,
while those simulated as persistence are MISSES. The
persistence areas are masked in white.


Fig. 10 Result from Exercise 3. (Dis)agreement between the
simulations and the reference maps for the areas where persistence
was observed


Fig. 11 Result from Exercise 3. (Dis)agreement between the
simulations and the reference maps for the areas where change was
observed


According to all the above results, it seems that the two
simulations are very similar in terms of predictive accuracy.
The vast majority of the pixels on the map are
DOUBLE CORRECT REJECTIONS, which means that
both models are very accurate when predicting persistence.
This makes sense in that persistence is very easy to simulate
in a highly stable area like the one we simulated. The most
challenging task is to correctly simulate change. The best
models are therefore those that simulate change most
accurately.

If we focus exclusively on the areas that changed, the
accuracy is very low: 86.1451% of the pixels were
DOUBLE MISSES, while in the remaining pixels there were
HITS in one or both models. This means that in the vast
majority of cases, our models failed to simulate the change
that actually took place. These simulations cannot therefore
be validated, although other validation tools can be used to
check whether the simulated pattern is valid. In this regard,
even if a hard comparison does not show a high level of
agreement between a simulation and the reference map, the
pattern of the simulated changing areas may be logical or
correct. The models can therefore be considered valid in a
qualitative sense.
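
The relationship between the eight new codes, Table 10 and the share of DOUBLE MISSES can be verified with a few lines of Python. This is only a sketch for checking the arithmetic: the assignment of pixel sums to codes follows Table 11 and is cross-checked against Table 10, and the helper new_code is an illustrative way of expressing the coding scheme, not something produced by the plugin.

```python
# Pixel sums from Table 11, keyed by (reference, model 1, model 2),
# where 0 = persistence and 1 = change. No-data pixels are excluded.
table_11 = {
    (0, 0, 0): 576_588,  # code 1, DOUBLE CORRECT REJECTION
    (0, 0, 1): 54,       # code 2, CORRECT REJECTION/FALSE ALARMS
    (0, 1, 0): 1_361,    # code 3, FALSE ALARMS/CORRECT REJECTION
    (0, 1, 1): 1_142,    # code 4, DOUBLE FALSE ALARMS
    (1, 0, 0): 4_570,    # code 5, DOUBLE MISSES
    (1, 0, 1): 10,       # code 6, MISSES/HITS
    (1, 1, 0): 299,      # code 7, HITS/MISSES
    (1, 1, 1): 426,      # code 8, DOUBLE HITS
}


def new_code(reference, model_1, model_2):
    """Map the three binary values to the new codes 1 to 8 used in Table 11."""
    return 1 + 4 * reference + 2 * model_1 + model_2


# Share of the observed change that neither model simulated.
observed_change = sum(n for (ref, _, _), n in table_11.items() if ref == 1)
double_misses_share = 100 * table_11[(1, 0, 0)] / observed_change

# Ignoring the reference map should give back the pixel sums of Table 10.
table_10 = {}
for (_, m1, m2), n in table_11.items():
    table_10[(m1, m2)] = table_10.get((m1, m2), 0) + n

print(f"Observed change: {observed_change} px")
print(f"DOUBLE MISSES:   {double_misses_share:.4f}% of the observed change")
print(table_10)
```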


5 Incidents and States 
Description 


Incidents and states are terms proposed by Pontius Jr. et al. 
(2017) to characterize land use cover changes in a series of 
three or more maps. States refer to the number of land uses 
or land covers a pixel is assigned in the series of maps. There 
can be as many states as there are maps in the series. Hao
and Gen-Suo (2014) used the term “land use classification
variety” for this metric when applying it to validate Land
Use Cover maps (MODIS Land Cover product).

Incidents refer to the number of times a pixel changes 
category over the course of a time series. There can be as 
many incidents as there are stages in the time series. In a 
series of 5 maps, there are 4 time-stages. The series may
therefore have between 0 and 4 incidents, i.e. the pixel may
change category between 0 and 4 times. The number of
incidents can also be referred to as “Transition frequency”.


Utility 


Exercises 


1. To validate a series of maps with two or more time points 


The number of incidents and states assigned to the pixels in a 
time series of Land Use Cover maps can help us identify the 
changes that take place for technical reasons, i.e. erroneous or 
spurious changes which do not really happen on the ground. 

When obtained from satellite imagery classification, Land 
Use Cover maps usually have important sources of uncer- 
tainty. Various different Land Use and Cover categories can 
have very similar levels of reflectance. If the imagery is 
obtained at different times of the year, or under different 
atmospheric conditions, the reflectance of a pixel can vary to 
a similar extent to the difference in reflectance between two
Land Use Cover categories. The same pixel could therefore
be classified under different categories over the course of the 
time series. The number of incidents and states of the pixel 
can potentially help us to identify these errors. 

For example, in a time series of six maps, if a pixel has 
five incidents, but only two states, it means that it alternates 
between these two categories at each stage in the time series. 
If we discover which categories are involved in the transi- 
tions we can determine to what extent these changes are 
logical. Incidents and states can also be used to validate a 
series of simulations, when working with modelling exer- 
cises to obtain scenarios for more than two time points. 
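
The two metrics are easy to express for a single pixel. The following Python sketch reproduces the example above (six maps, five incidents, two states). The category codes are hypothetical and the function names are illustrative.

```python
def count_states(series):
    """Number of different categories a pixel is assigned across the series."""
    return len(set(series))


def count_incidents(series):
    """Number of times the category changes between consecutive maps."""
    return sum(1 for before, after in zip(series, series[1:]) if before != after)


# A pixel that alternates between two (hypothetical) categories in six maps.
alternating_pixel = [2, 5, 2, 5, 2, 5]
# A pixel that never changes category.
stable_pixel = [3, 3, 3, 3, 3, 3]

for name, series in [("alternating", alternating_pixel), ("stable", stable_pixel)]:
    print(name, "incidents:", count_incidents(series), "states:", count_states(series))
```

A pixel with many incidents but few states, like the alternating one, is a natural candidate for closer inspection as a possible technical change.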


QGIS Exercise 


Available tools 


* Processing Toolbox 
GRASS 
Raster 
r.series 


The GRASS toolbox associated with QGIS has a tool for 
calculating the number of states in a time series of Land Use 
Cover maps. QGIS does not provide any specific tool to 
calculate the number of incidents in the time series, so this 
metric must be calculated manually. This is done using the 
raster calculator and a raster reclassification tool. 

QGIS offers several raster calculators and reclassification 
tools. Although they are all valid, in this exercise we will be 
using the ones from the core QGIS toolbox. 

Pontius et al. (2017) also developed a tool in Excel to
automatically calculate the incidents and states of a series of
Land Use Cover raster maps in .rst format. It is available
online free of charge.*


Exercise 1. To validate a series of maps with two or 
more time points 


Aim 


To find out if technical changes may have taken place in the 
last series of CORINE Land Cover maps produced for the 
Asturias Central Area. 


* The tool is available on R. G. Pontius Jr's personal website:
http://www2.clarku.edu/~rpontius/.


Materials

CORINE Land Use Map Asturias Central Area 2005 
CORINE Land Use Map Asturias Central Area 2011 
CORINE Land Use Map Asturias Central Area 2018 


Requisites 


All maps must be rasters and have the same resolution, 
extent and projection. 


Execution 
Step 1 


In order to calculate the number of states per pixel, we must
open the r.series tool and select all the maps that form part of
the series of Land Use Cover maps we are analysing (“Input
raster layer(s)”). In this case, we select the three maps in our
series: CORINE Land Cover 2005, 2011 and 2018.

In the “Aggregate operation [optional]” option, select
“Diversity”. This will count the number of different
categories to which a pixel is assigned over the course of the
time series.

In “Advanced parameters”, indicate the range of values of
the Land Use Cover maps introduced as input, i.e. the
minimum and maximum values. In our case, the minimum
value for a category is 0 and the maximum value is 12
(Fig. 12).

The final stage is to indicate where the new map will be
saved.


Fig. 12 Exercise 1. Step 1. r.series


Step 2


There is no specific tool for calculating the number of
incidents in a pixel over the course of a time series. This
operation must therefore be carried out manually. The first
step is to identify where the changes happened. For each
pixel, we must then calculate the number of times it
underwent change (or not). To carry out these operations, we
have to work with pairs of maps: first 2005 and 2011 and
then 2011 and 2018.

To identify where the changes happened, for each pair of
rasters we must subtract one raster from the other. If a pixel
does not change, the result of the subtraction will be a value
of 0 for that pixel. If the pixel changes, the result of the
subtraction will be a value other than 0.

The subtraction operation is carried out using the Raster
calculator, in which we must write the following subtraction
expression for each pair of maps:


t2 map - t1 map


For the first pair of maps, for example, the expression reads
"C11@1" - "C05@1".

We also need to indicate which raster is the reference map
that will be used to define the characteristics (extent, spatial
resolution and projection) of the new raster obtained after the
calculation. In this case, we will be using the first map in our
series (CORINE 2005). This must be indicated in the
“Reference layer(s) (used for automated extent, cellsize, and
CRS) [optional]” option (Fig. 13).


Fig. 13 Exercise 1. Step 2. Raster Calculator


Step 3


Once the previous step has been completed, the maps
obtained must be reclassified to enable us to identify the
pixels where an incident took place (values other than 0) and
the pixels that were incident-free at each stage (0 values).
To identify all pixels in which incidents took place with a
value of 1, we reclassify all values other than 0 as 1 using the
Reclassify by table tool (Fig. 15). The first stage in the
reclassification process is to indicate the two rasters that
must be reclassified. Then, detail the reclassification criteria
using the “Reclassification table” option. In the window that
opens for selecting the reclassification criteria, add two rows
using the “Add row” button. Then, introduce the following
values (Fig. 14):

Minimum  Maximum  Value
-999     -1       1
1        999      1

That means that all values between -999 and -1 will be
reclassified with the value 1. The same will be true for all
values between 1 and 999. If the raster subtraction produces
negative values below -999 or positive values above 999, we
will need to adjust the values in the reclassification table
accordingly.


Step 4 


The last step is to count the number of incidents for each 
pixel over the course of the time series. This is done using 
the Raster calculator, which adds together the rasters we 
reclassified in the previous step using the following 
expression: 


"Incidents_C05_C11@1" + "Incidents_C11_C18@1"


The CORINE 2005 map will be used as a reference to 
define the characteristics of the output raster (Fig. 16). 
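
The logic of Steps 2 to 4 can be summarised in a short Python sketch. It uses small nested lists with hypothetical category codes instead of rasters, and the function names are illustrative. It mirrors the Raster calculator and Reclassify by table operations rather than calling QGIS itself.

```python
def subtract(later, earlier):
    """Step 2: cell-by-cell difference (t2 map - t1 map)."""
    return [[b - a for a, b in zip(row_earlier, row_later)]
            for row_earlier, row_later in zip(earlier, later)]


def reclassify_nonzero(grid):
    """Step 3: any value other than 0 becomes 1 (an incident took place)."""
    return [[1 if value != 0 else 0 for value in row] for row in grid]


def add(grid_a, grid_b):
    """Step 4: add the incident maps of the two intervals."""
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(grid_a, grid_b)]


# Hypothetical 2 x 2 maps for the three dates.
c05 = [[1, 1], [2, 3]]
c11 = [[1, 4], [2, 3]]
c18 = [[1, 1], [5, 3]]

incidents_c05_c11 = reclassify_nonzero(subtract(c11, c05))
incidents_c11_c18 = reclassify_nonzero(subtract(c18, c11))
incidents = add(incidents_c05_c11, incidents_c11_c18)
print(incidents)
```

Testing for "other than 0" directly avoids having to choose reclassification bounds such as -999 and 999, which is the reason the table in Step 3 may need adjusting for maps with larger category codes.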


Fig. 14 Exercise 1. Step 3. Reclassification table of the Reclassify by Table tool


Fig. 15 Exercise 1. Step 3. Reclassify by Table


Fig. 16 Exercise 1. Step 4. Raster Calculator


Results and Comments


After completing all the operations described above, two
different maps will be obtained: one with the number of states
per pixel and another with the number of incidents per pixel.

The maps in Fig. 17 show the number of incidents
and states for a specific part of the Asturias Central Area.
Most of the areas that changed over the period 2005-2018
underwent just one LUCC transition (one incident and two
states). However, we discovered a couple of cases in which
there were two incidents and two states. This means that, for
the three years analysed (2005, 2011 and 2018), there were two
changes or transitions, but these only involved two land uses
or covers. In other words, the area changed from its original
land use or cover in 2005 to a different one in 2011 and then
reverted to the original in 2018.

If we refer back to the original maps, we can identify the
transitions that took place. The changing area on the right
(1) (Fig. 17) underwent a transition from “Agricultural
areas” in 2005 to “Urban fabric” in 2011 and then changed
back to “Agricultural areas” in 2018. It is highly unlikely
that an agricultural cover could change to an artificial cover
and then revert to its original state a few years later. It must
therefore have been an error (technical or spurious change).

The changing area on the left (2) (Fig. 17) underwent a 
transition from “Agricultural areas” in 2005 to “Vegetation 
areas” in 2011, before changing back to “Agricultural areas” 
in 2018. This transition, although unlikely, seems more 
logical. So, before labelling it as an error or technical 
change, we should confirm whether these changes really 
took place in the area in question during the timeframe 
analysed. This can be done by photointerpretation of aerial 
imagery. 


Fig. 17 Result from Exercise 1. Number of incidents and states for an example area of the Asturias Central Area 


6 Intensity Analysis 
Description 


Intensity analysis, proposed by Aldwaik and Pontius (2012),
enables us to assess the rate or intensity at which change
takes place during each time interval in a time series of LUC
maps. It also helps identify apparently random or uniform
processes. It is a three-stage analysis process, which
identifies: (i) periods of relatively slow/fast change; (ii)
relatively dormant/active land use categories and (iii) the
transitions that are actively avoided/targeted by a given land
use category. A series of maps with three or more time points
is needed for this analysis.

During the first stage of this process, the overall rate of land 
use change over each time interval is analysed to assess 
whether change was relatively fast or slow. To this end, the 
average annual rate of change for each time interval is com- 
pared with the average annual rate of change for the whole 
period. 
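
The interval-level comparison can be sketched in a few lines of Python. The numbers below are hypothetical and the function is only an illustration of the calculation described above; it is not the code of the intensity.analysis package. Each interval's annual rate is the changed area divided by the study area and by the interval duration, and the uniform rate is the total change divided by the study area and by the duration of the whole period.

```python
def interval_intensities(change_sizes, durations, study_area_size):
    """Return the annual change rate per interval, the uniform rate and a label.

    change_sizes: changed pixels in each interval.
    durations: length of each interval (e.g. in years).
    study_area_size: number of valid pixels in the study area.
    """
    annual = [size / (study_area_size * duration)
              for size, duration in zip(change_sizes, durations)]
    uniform = sum(change_sizes) / (study_area_size * sum(durations))
    # An interval is labelled "Fast" only if it exceeds the uniform rate.
    behaviour = ["Fast" if rate > uniform else "Slow" for rate in annual]
    return annual, uniform, behaviour


# Hypothetical example: two intervals of 12 and 6 years in a 1,000-pixel area.
annual, uniform, behaviour = interval_intensities(
    change_sizes=[150, 30], durations=[12, 6], study_area_size=1000
)
for rate, label in zip(annual, behaviour):
    print(f"annual rate {rate:.4f} vs uniform {uniform:.4f}: {label}")
```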

The second stage analyses the intensity of change at
category level within each time interval relative to the
overall change rate for the interval calculated in stage one. It
measures the gross losses and gross gains in area for each
category so as to analyse whether the category shows a
similar, stable pattern across the various time intervals in
terms of the intensity of gains and losses. These observed
intensities for each category are compared with an average
annual rate of gains/losses that would exist if the changes
within each interval were distributed uniformly over the
entire time interval. This shows which categories are
relatively dormant or active.
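
A minimal Python sketch of the category-level calculation for a single interval follows. It assumes a square cross-tabulation matrix with the initial categories in rows and the final categories in columns. Gain intensity is taken relative to the size of the category at the end of the interval and loss intensity relative to its size at the start, which is our reading of Aldwaik and Pontius (2012). The matrix values and category names are hypothetical, and every category is assumed to be present at both dates.

```python
def category_intensities(matrix, duration, labels):
    """Annual gain and loss intensity per category, compared with the uniform rate."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    start_sizes = [sum(row) for row in matrix]
    end_sizes = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    persistence = sum(matrix[k][k] for k in range(n))
    # Uniform rate: overall annual change intensity of the interval.
    uniform = (total - persistence) / (duration * total)

    results = {}
    for k, label in enumerate(labels):
        gain = (end_sizes[k] - matrix[k][k]) / (duration * end_sizes[k])
        loss = (start_sizes[k] - matrix[k][k]) / (duration * start_sizes[k])
        results[label] = {
            "gain": gain,
            "gain_behaviour": "active" if gain > uniform else "dormant",
            "loss": loss,
            "loss_behaviour": "active" if loss > uniform else "dormant",
        }
    return uniform, results


# Hypothetical cross-tabulation (pixels) for one 10-year interval.
labels = ["agriculture", "vegetation", "urban"]
matrix = [
    [80, 15, 5],   # agriculture at the start
    [2, 50, 8],    # vegetation at the start
    [0, 0, 40],    # urban at the start
]
uniform, results = category_intensities(matrix, duration=10, labels=labels)
for label in labels:
    print(label, results[label]["gain_behaviour"], results[label]["loss_behaviour"])
```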

The final stage is at transition level. It examines the intensity 
of a particular transition over a given time interval, taking into 
account the different sizes of the categories and relative to the 
results of the category-level analysis. The gains made by a 
specific category may vary in size and intensity among the 
different categories from which it makes these gains. By com- 
paring the observed rate of gains from each category with a 
uniform rate of gains that would exist if the gains were made 
uniformly from among all the available categories, we can 
identify those categories that are intensively avoided or tar- 
geted. Losses can be analysed in a similar way. 
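
The transition-level comparison for the gains of one category can be sketched in the same way. As before, this is an illustrative Python example with a hypothetical matrix (initial categories in rows, final categories in columns) and not the implementation used by the intensity.analysis package. The observed intensity of each transition is taken relative to the initial size of the losing category, and the uniform intensity relative to the initial size of all the other categories together.

```python
def transition_intensities_for_gain(matrix, duration, gaining):
    """Compare where a gaining category takes its gains from with a uniform rate.

    gaining: index of the category whose gains are analysed.
    Returns the uniform intensity, the observed intensity per losing category
    and a label ("targeted" or "avoided") per losing category.
    """
    n = len(matrix)
    start_sizes = [sum(row) for row in matrix]
    others = [i for i in range(n) if i != gaining]
    gross_gain = sum(matrix[i][gaining] for i in others)
    uniform = gross_gain / (duration * sum(start_sizes[i] for i in others))
    observed = {i: matrix[i][gaining] / (duration * start_sizes[i]) for i in others}
    behaviour = {i: "targeted" if rate > uniform else "avoided"
                 for i, rate in observed.items()}
    return uniform, observed, behaviour


# Hypothetical cross-tabulation (pixels) for one 10-year interval.
# Categories: 0 = agriculture, 1 = vegetation, 2 = urban.
matrix = [
    [80, 15, 5],
    [2, 50, 8],
    [0, 0, 40],
]
uniform, observed, behaviour = transition_intensities_for_gain(matrix, duration=10, gaining=2)
print(f"uniform intensity: {uniform:.6f}")
for source, rate in observed.items():
    print(f"from category {source}: {rate:.6f} ({behaviour[source]})")
```

In this toy example, urban gains more intensively from vegetation than from agriculture, even though the absolute number of pixels involved is similar, because vegetation is the smaller category.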

Intensity analysis also allows us to determine whether a
particular transition occurs at a stable rate or occurs more
intensely over a particular time interval within the series. If
the same category is targeted (or avoided) over all the
different time intervals, then this transition is said to be
stationary.


Utility 


Exercises 


1. To validate a series of maps with two or more time points 


Intensity analysis analyses the size and intensity of land 
changes. It also checks for stationarity and takes the relative 
size of the categories into account, rather than just the 
absolute gains or losses they may undergo. 

At the interval level, users can identify how quickly or 
slowly LUC change is taking place during each time interval 
as compared to the average annual rate of change over the 
whole time series. At the category level, intensity analysis 
allows users to identify which categories are dormant versus 
active in terms of gains or losses in the size of each category. 
At the transition level, when a given category makes gains or 
losses, users can identify which other categories are most 
intensively targeted or avoided. 


QGIS Exercise 


Available tools 


* Aldwaik and Pontius matrix (Excel sheet)
https://sites.google.com/site/intensityanalysis/
* R Package intensity.analysis
* Processing R provider Plugin
Intensity analysis.rsx R script


QGIS does not have a specific tool for intensity analysis,
although the analysis has been implemented in an R package
(intensity.analysis) (Pontius and Khallaghi, 2019). Based on
this package, we have developed an R script that integrates
the analysis into QGIS. The script carries out the entire
analysis and generates three tables containing the results at
each level of analysis (overall, category and transition) and a
plot showing the results at the interval level.

See Chapter “About this Book” for more detailed
information about how to integrate R into QGIS and how to use
R scripts such as the one applied in this exercise.
Exercise 1. To validate a series of maps with two or
more time points


Aim


To study land change in the Ariège study area using the
CORINE Land Use maps dated 2000, 2012 and 2018. The
results of this exercise can also be used to validate land
change.


Materials


CORINE Land Cover Map Val d'Ariège 2000
CORINE Land Cover Map Val d'Ariège 2012
CORINE Land Cover Map Val d'Ariège 2018


Requisites 


All maps must be in raster format and have the same reso- 
lution, extent and projection. 


Execution 


If necessary, install the Processing R provider plugin and 
download the Intensity analysis.rsx R script into the R 
scripts folder (processing/rscripts). See Chapter “About this 
Book” for further information about how to use 
the QGIS R script. 


Step 1 


The land use maps need to be stacked into a multilayer file in 
chronological order. The first map is the oldest map. The 
second map is the next oldest and so on. This can be done 
with the Merge tool in the Raster tab. 
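The chronological order matters because each pair of consecutive layers defines one time interval. As a minimal sketch of that idea, the Python below cross-tabulates consecutive layers of a small, made-up stack. It uses nested lists rather than real rasters, and the function names and the no-data value of 0 are assumptions made for illustration, not part of the R script.

```python
from collections import Counter


def cross_tabulate(map_t1, map_t2, nodata=0):
    """Count pixels for each (category at t1, category at t2) pair."""
    counts = Counter()
    for row1, row2 in zip(map_t1, map_t2):
        for a, b in zip(row1, row2):
            if a == nodata or b == nodata:
                continue  # skip pixels without data at either date
            counts[(a, b)] += 1
    return counts


def cross_tabulate_stack(stack, nodata=0):
    """One cross-tabulation per interval of a chronologically ordered stack."""
    return [cross_tabulate(stack[i], stack[i + 1], nodata)
            for i in range(len(stack) - 1)]


# Hypothetical stack: three dates, oldest first, 2 x 3 pixels each.
stack = [
    [[1, 1, 2], [2, 3, 0]],
    [[1, 2, 2], [1, 3, 0]],
    [[1, 2, 2], [1, 1, 3]],
]
for interval, table in enumerate(cross_tabulate_stack(stack), start=1):
    print(interval, dict(table))
```

If the layers were stacked in the wrong order, the same counts would be attributed to the wrong transitions (for example, 2 to 1 instead of 1 to 2).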


Step 2 

Run the script and fill in the required parameters (path and 
name of the time-series stack, null value, the path to the 
folder where the results will be saved, the path and name of 
the output plot) as shown in Fig. 18. 


Results and Comments 


The script will generate three files in the results folder: 
IntervalLevel.csv, CategoryLevel.csv and TransitionLevel.csv. 
A plot of the interval level is also produced. Plots of 
both the category and transition levels have to be created 
from the CSV tables, for example in Excel. 


Fig. 18 Exercise 1. Step 2. Intensity Analysis R script (parameters shown: LULCseries, the time-series stack; Nullvalue, 0; path, the results folder; R Plots, the output plot file) 


,"Change Size","Annual change","Uniform Change across Intervals","Interval Behavior" 
1 2,112424,0.0227104877429948,0.0187202341993798,Fast 
2 3,72918,0.0147299806557647,0.0187202341993798,Slow 


Fig. 19 Result from Exercise 1. Average annual rate of change for each time interval and for the entire period 


The first file, IntervalLevel.csv (Fig. 19), 
shows the average annual rate of change for each time 
interval (in this case there are two) and the average annual 
rate of change for the entire period. When the average rate 
for each interval is compared with the overall average rate, 
we can assess whether the interval in question was one of 
slow or fast change. 

The automatically generated plot is shown in Fig. 20. The 
results show that land use change was more intense in the 
first time period than in the second. The uniform change rate 
over the entire period was approximately 1.87% of the study 
area (0.0187 in Fig. 19), which means that change was 
relatively fast over the first period and relatively slow over 
the second. 

The CategoryLevel.csv file (Fig. 21) contains the gross 
losses and gross gains, and the corresponding loss and gain 
intensities, for each land use category (in this case there are 
six categories). If these intensities are compared with the 
uniform intensity that would exist if the change within the 
interval were distributed uniformly across the whole study 
area, we can see which land categories are relatively dormant 
or active. 
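The category-level comparison can be sketched as follows. This is a hedged illustration based on the usual definition of intensity analysis (loss intensity relative to the size of the category at the start of the interval, gain intensity relative to its size at the end), not the code of the R package. The function name, the 3 x 3 transition matrix and the 10-year interval are hypothetical.

```python
def category_level(matrix, duration):
    """Category-level intensities from a square transition matrix.

    matrix[i][j] is the number of pixels that were category i at the
    start of the interval and category j at the end.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    changed = total - sum(matrix[k][k] for k in range(n))
    # Intensity if the interval's change were spread evenly over the domain.
    uniform = changed / (duration * total)
    rows = []
    for k in range(n):
        start_size = sum(matrix[k])
        end_size = sum(matrix[i][k] for i in range(n))
        loss = start_size - matrix[k][k]
        gain = end_size - matrix[k][k]
        loss_int = loss / (duration * start_size) if start_size else 0.0
        gain_int = gain / (duration * end_size) if end_size else 0.0
        rows.append({
            "loss_intensity": loss_int,
            "gain_intensity": gain_int,
            # Strictly above the uniform intensity counts as active here.
            "loss_behavior": "Active" if loss_int > uniform else "Dormant",
            "gain_behavior": "Active" if gain_int > uniform else "Dormant",
        })
    return uniform, rows


# Hypothetical transition matrix for three categories over 10 years.
example = [[80, 15, 5], [10, 170, 20], [0, 0, 100]]
uniform, rows = category_level(example, 10)
print(uniform)
for row in rows:
    print(row)
```

Because each intensity is divided by the size of the category, a small category with modest absolute gains can still be active, while a large category with larger absolute gains can be dormant.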

This table may be used to calculate the plots at the cat- 
egory level for each time interval. Figure 22 shows the result 
for the first time interval. 

This figure shows the intensity of change in the different 
categories, regardless of their relative size within the study 
area. The categories with short bars to the left of the blue line 
representing average, uniform intensity are relatively 
inactive or dormant, whereas those that extend to the right are 
relatively active. For example, Category 1 showed the 
highest intensity in terms of land use gains, while Category 4 
underwent more intense gains and losses than the average. 
At the other end of the scale, Category 3 was relatively 
dormant compared to the other land use categories, as both 
gain and loss intensity are located to the left of the blue line. 


Fig. 20 Result from Exercise 1. Time interval change intensity plot (chart title: Time Interval Change Intensity; axes: Time Interval, Annual Change (% of domain)) 


Category level Intensity Analysis for interval: 1-2 
,"Gross Loss","Gross Gain","Loss Intensity","Gain Intensity","Uniform Category Intensity","Loss Behavior","Gain Behavior" 
1,"516","33840","0.00345807420115805","0.185384025419086","0.0227104877429948","Dormant","Active" 
2,"60134","18721","0.0247443942018108","0.00783700896770504","0.0227104877429948","Active","Dormant" 
3,"10483","33408","0.00595948030921306","0.0187477798460244","0.0227104877429948","Dormant","Dormant" 
4,"40844","25480","0.0720383226097754","0.0461918885013379","0.0227104877429948","Active","Active" 
5,"447","975","0.010780436040903","0.0232187083253953","0.0227104877429948","Dormant","Active" 
6,"0","0","0","0","0.0227104877429948","Dormant","Dormant" 


Fig. 21 Result from Exercise 1. Gross gains and losses and amount of loss and gain intensities for each category 


Finally, the TransitionLevel.csv file (Fig. 23) shows which 
transitions are intensively avoided or targeted, taking into 
account the relative size of all the individual categories in the 
landscape. It compares the observed intensity of the gains 
from each category with the uniform intensity that would exist 
if the gains were made uniformly from among all the available 
categories. This information may be used to produce plots 
showing the intensity for each transition and time interval. 
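The transition-level comparison for the gains of one category can be sketched as follows. As before, this is an illustration under stated assumptions rather than the package's own code: the function name, the example matrix and the 10-year interval are hypothetical, and the formulas follow the usual definition in which each transition is divided by the start size of the losing category.

```python
def transition_level_gain(matrix, duration, n):
    """Transition intensities for the gain of category n.

    matrix[i][j] is the number of pixels moving from category i (start
    of the interval) to category j (end of the interval).
    """
    sizes_start = [sum(row) for row in matrix]
    others = [i for i in range(len(matrix)) if i != n]
    gain = sum(matrix[i][n] for i in others)
    available = sum(sizes_start[i] for i in others)
    # Intensity if category n gained uniformly from all other categories.
    uniform = gain / (duration * available) if available else 0.0
    rows = {}
    for i in others:
        size = sizes_start[i]
        intensity = matrix[i][n] / (duration * size) if size else 0.0
        if intensity > uniform:
            behavior = "Target"
        elif intensity < uniform:
            behavior = "Avoid"
        else:
            behavior = "Uniform"
        rows[i] = (intensity, behavior)
    return uniform, rows


# Hypothetical matrix: which categories does category 2 (index 2) gain from?
example = [[80, 15, 5], [10, 170, 20], [0, 0, 100]]
uniform, rows = transition_level_gain(example, 10, 2)
print(uniform, rows)
```

In this made-up example the gain of the third category targets the second category and avoids the first, even though both lose pixels to it.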

Figure 24 shows the first block of information in Fig. 23, 
that is, the annual transition size for gains in Category 1 in 
the first interval or period of time. The vertical blue line 
shows the uniform transition intensity. Categories whose bars 
lie to the left of this line are avoided by the gain of 
Category 1 (for example, the change from Category 3 to 
Category 1), while categories whose bars extend to the right 
of the blue line are targeted (for example, the transition 
from Category 2 to Category 1). 


Fig. 22 Result from Exercise 1. Plot of gain and loss intensities per category (legend: Gain Intensity, Loss Intensity) 


Transition level Intensity Analysis for interval: 1-2 

,"Annual Transition Size for Gain of 1","Transition Intensity for Gain of 1","Uniform Transition Intensity","Transition Behavior for Gain of 1" 
2,"33352","0.0137239338048158","0.00704839061747568","Target" 
3,"488","0.000277423103204805","0.00704839061747568","Avoid" 
4,"0","0","0.00704839061747568","Avoid" 
5,"0","0","0.00704839061747568","Avoid" 
6,"0","0","0.00704839061747568","Avoid" 

,"Annual Transition Size for Gain of 2","Transition Intensity for Gain of 2","Uniform Transition Intensity","Transition Behavior for Gain of 2" 
1,"152","0.0010186575166202","0.00742865872652132","Avoid" 
3,"4405","0.00250419829839583","0.00742865872652132","Avoid" 
4,"14164","0.0249816570719043","0.00742865872652132","Target" 
5,"0","0","0.00742865872652132","Avoid" 
6,"0","0","0.00742865872652132","Avoid" 

,"Annual Transition Size for Gain of 3","Transition Intensity for Gain of 3","Uniform Transition Intensity","Transition Behavior for Gain of 3" 
1,"0","0","0.0104685726605053","Avoid" 
2,"7703","0.00316968883720605","0.0104685726605053","Avoid" 
4,"25705","0.0453370160288972","0.0104685726605053","Target" 
5,"0","0","0.0104685726605053","Avoid" 
6,"0","0","0.0104685726605053","Avoid" 

,"Annual Transition Size for Gain of 4","Transition Intensity for Gain of 4","Uniform Transition Intensity","Transition Behavior for Gain of 4" 
1,"364","0.00243941668453785","0.00581292422027424","Avoid" 
2,"19079","0.00785077155978894","0.00581292422027424","Target" 
3,"5590","0.00317785890761242","0.00581292422027424","Avoid" 
5,"447","0.010780436040903","0.00581292422027424","Target" 
6,"0","0","0.00581292422027424","Avoid" 

,"Annual Transition Size for Gain of 5","Transition Intensity for Gain of 5","Uniform Transition Intensity","Transition Behavior for Gain of 5" 
1,"0","0","0.000198620939169434","Avoid" 
2,"0","0","0.000198620939169434","Avoid" 
3,"0","0","0.000198620939169434","Avoid" 
4,"975","0.00171964950897392","0.000198620939169434","Target" 
6,"0","0","0.000198620939169434","Avoid" 


Fig. 23 Results from Exercise 1. Comparison of the observed rate of gains with a uniform rate of gains, differentiating between transitions that are intensively avoided and transitions that are intensively targeted 


Fig. 24 Result from Exercise 1. Graph with the annual transition size for gains in category 1 in the first period of time 

These analyses can also be used to validate land change 
in a series of maps with two or more time points. If there are 
large differences at the interval, category and/or transition 
level between the different time intervals, this means it 
would be difficult to validate the time series for simulating 
future trend scenarios, as the intensity of change over the 
time series has not been sufficiently stable or uniform to 
provide a base for future predictions. These differences may 
also be due to errors in the maps, which must be verified. 


7 Flow Matrix 
Description 


The Flow Matrix was developed by Runfola and Pontius 
(2013) to quantitatively measure the instability of annual 
land use change between time intervals. The aim was to 
identify anomalies relative to the total amount of change 
over the time series. Flow Matrix exercises require a series 
of maps with at least three time points. 

The Flow Matrix is a cross-tabulation matrix that shows 
the proportion of the study area that transitions from one 
category to another, excluding persistence. It assumes linear 
change over time during each time interval. It allows us to 
calculate: (a) the annual proportion of the study area that 
changes during each time interval; (b) the uniform annual 
proportion of the study area that changes over the entire time 
series; and (c) the proportion of change that would have to be 
reallocated to different time intervals in order for change to 
be perfectly stable (R). When change is perfectly stable, R is 
zero. This value increases as change becomes more unstable. 

A vertical bar chart is produced showing the amount of 
annual land use change during each time interval as com- 
pared to the uniform annual change. 
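The uniform annual change and R can be illustrated with a short Python sketch. It is one reading of the description above, not the code of the R scripts used in this chapter. The function name is hypothetical, the annual change values are those reported for this chapter's exercise (Fig. 30), and the interval lengths of 12 and 6 years are an assumption based on the map dates (2000, 2012 and 2018).

```python
def flow_stability(annual_change, durations):
    """Uniform annual change and instability R for a time series.

    annual_change: annual change of each interval (share of the study area)
    durations:     length of each interval in years
    """
    if len(annual_change) != len(durations):
        raise ValueError("one duration is needed per interval")
    # Total change over the series and the rate if it were spread evenly.
    total = sum(a * d for a, d in zip(annual_change, durations))
    uniform = total / sum(durations)
    # Change above the uniform rate would have to be moved to other intervals.
    excess = sum(max(0.0, a - uniform) * d
                 for a, d in zip(annual_change, durations))
    r = excess / total if total else 0.0
    return uniform, r


# Values reported in Fig. 30; durations of 12 and 6 years are assumed.
uniform, r = flow_stability([0.189254064524957, 0.245499677596079], [12, 6])
print(round(uniform, 3), round(r, 2))  # 0.208 and 0.06
```

Under these assumptions the sketch gives a uniform rate of about 0.208 and an R of about 0.06, in line with the values shown in Figs. 29 and 26.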


Utility 


Exercises 


1. To validate a series of maps with two or more time points 


The Flow Matrix provides an analysis of the temporal extent 
at which phenomena are stable. It can be used to find out 
whether land use change takes place at a uniform rate over 
the course of the entire study period or if more change takes 
place during certain intervals. It can also be used to detect 
errors. If one particular interval is very different from the 
others in terms of its annual change rate, this may be due to 
errors in the mapping or the methodology. 

The Flow Matrix can also be used to select particular 
calibration intervals when developing simulations of future 
scenarios based on historical trends, as the data should show 
the greatest possible uniformity in past land use change. It 
can also be used to assess whether the results of a trend 
scenario are consistent, i.e. whether the model simulates much 
more or much less change than actually happened in the 
historical series. 


QGIS Exercise 


Available tools 


* Processing R provider Plugin 
Stable change flow matrix.rsx R script 
Flow matrix graf.rsx R script 


No specific tool is available in QGIS to calculate the Flow 
Matrix. We have developed two R scripts 
(Stable change flow matrix.rsx and Flow matrix graf.rsx) 
to this end. See Chapter “About this Book” for more detailed 
information about how to integrate R into QGIS and how to 
use R scripts such as those applied in this exercise. 

The first script will generate two tables in CSV format 
with the proportions of stable and unstable change over the 
whole study period, respectively. The second script will 
generate two tables, in CSV format, presenting the annual 
change for each interval and the uniform rate, respectively. It 
also produces a plot showing this annual change and the 
uniform rate for the entire time series. 


Exercise 1. To validate a series of maps with two or 
more time points 


Aim 

To study and validate land change in the Ariège Valley study 
area using CORINE Land Use maps dated 2000, 2012 and 
2018. 

Materials 

CORINE Land Cover Map Val d’Ariège 2000 
CORINE Land Cover Map Val d’Ariège 2012 
CORINE Land Cover Map Val d’Ariège 2018 


Requisites 


All maps must be in raster format and have the same reso- 
lution, extent and projection. 


M. Paegelow et al. 
Execution 


If necessary, install the Processing R provider plugin and 
download the Stable change flow matrix.rsx and 
Flow matrix graf.rsx R scripts into the R scripts folder 
(processing/rscripts). See Chapter “About this Book” for 
further information about how to use the QGIS R script. 


Step 1 


Run the stable and unstable change script 
(Stable change flow matrix.rsx) and fill in the required 
parameters: number of time points (in this case, 3), 
background value (in this case, 0), land use maps and number 
of years between the time points. Make sure you save the files 
in the correct folder (Fig. 25). 


Step 2 


Now, run the Annual Change Rates script 
(Flow matrix graf.rsx). Fill in the parameters as in the 
previous step (Fig. 25) to generate the plot. 


Results and Comments 
Step 1 generates two CSV files containing the data regarding 
unstable change (Fig. 26) and stable change (Fig. 27). The 
first file shows the proportion of change that would have to 
be reallocated to different time intervals in order for change 
to be perfectly stable (R). If change is perfectly stable, then 
R is zero. The R value increases as change becomes more 
unstable. In our case, R is 0.06, which means that 6% of 
change is unstable. 

Stable change is the proportion of change that is stable in 
our study area between the first and the second intervals. 
It is the complement of the R value (R = 1 - stable 
change). In this case, R = 1 - 0.94 = 0.06. 


Step 2 produces a chart showing the annual amount of land use 
change (expressed as a proportion of the study area) during 
each time interval and the uniform rate that would exist if the 
annual changes were distributed uniformly across the entire 
time period. This is shown as a horizontal line in Fig. 28. It 
also generates a CSV file showing the uniform change 
calculation, which is also expressed as a proportion of the 
study area (Fig. 29). 


Fig. 25 Exercise 1. Step 1. Stable and unstable change R script (parameters shown: number of time points, 3; land use maps t1 to t3, CLC_2000, CLC_2012 and CLC_2018 [EPSG:32631]; land use maps t4 and t5, optional, left empty; background value, 0; number of years t1 t2, 12; number of years t2 t3, optional) 

Fig. 26 Result from Exercise 1. Rate of unstable changes (value: 0.0600907151823836) 

Fig. 27 Result from Exercise 1. Rate of stable changes (value: 0.939909284817616) 

Fig. 28 Result from Exercise 1. Graph with the annual change rate for the two time periods that have been analysed (chart title: Annual Change Rate; bars: Annual change t1 t2, Annual change t2 t3) 

Fig. 29 Result from Exercise 1. Rate of uniform change (value: 0.208002602215331) 


The tool also provides us with data about the annual land 
use change during each interval, as a percentage of the study 
area (Fig. 30). In our example, this is approximately 0.19 for 
the first time interval and 0.25 for the second. 

These results show that land use change did not occur at 
the same uniform rate over the course of the study period 
and that the annual rate of change was higher in the second 
interval. It should be noted that if one time interval is very 
different from the others in terms of the amount of annual 
change (this did not happen in our case), this may be due to 
potential mapping errors. 


Fig. 30 Result from Exercise 1. Annual land change rates for each time period (values: 0.189254064524957, 0.245499677596079) 


The maps validated here could be used for simulating 
future trend scenarios, as there is not much difference 
between the intervals in terms of the annual rate of land use 
change. 


Abstract 


In Land Use Cover Change (LUCC) modelling, soft maps 
are often produced to express the propensity of an area to 
land use change. These maps are generally prepared in 
raster format, and have values between 0 and 1, 
indicating the propensity of each pixel to change. In the 
literature, they are referred to as suitability, change 
potential or change probability maps. These maps are 
sometimes considered as the final product of a model (e.g. 
map of deforestation risk), but they can also serve as 
intermediate products that simulate the changes from 
which a hard-simulated land use/cover map can later be 
prepared using, for example, a cellular automaton. In both 
cases, it is essential to evaluate the soft map's ability to 
identify the areas that are most susceptible to change. One 
way of assessing this ability is to compare the spatial 
coincidence between the real changes observed on the 
ground and the values estimated by the soft map. One 
would expect real change areas to coincide with high 
change potential values (near 1) and real no-change areas 
with low change potential values (near 0). This compar- 
ison can be made using various statistical approaches 
including Correlation Coefficient (Sect. 1), the Receiver 
Operating Characteristic (ROC) (Sect. 2) and the Differ- 
ence in Potential (DiP) (Sect. 3). Other measures, such as 
total uncertainty, quantity uncertainty and allocation 
uncertainty (Sect. 4), are used exclusively in the analysis 
of soft maps. In this chapter, we describe the fundamental 
steps involved in these four statistical approaches to 
validating the soft maps produced by a model. The four 
sections are illustrated with specific cases: to validate soft 
maps produced by the model, to validate soft maps 
produced by the model against a reference map and to 
validate soft maps produced by various models against a 
reference map. We use the Ariège database to validate the 
different soft maps (change potential and suitability maps) 
produced by the model by comparing them with real land 
use maps of the Ariège Valley for two dates (CORINE 
2012 and 2018). All these validation techniques are 
carried out using raster data. As commented earlier, the 
soft maps produced by the model are continuous, ranked 
variables. We designed exercises using this original 
format. In other chapters of this book, the soft maps 
produced by the model are validated after reclassification 
of the original maps. 


Keywords 


Soft maps * Correlation * Receiver Operating 
Characteristic * Difference in Potential * Uncertainty * 
Validation 


Preliminary QGIS Exercise 


Available tools 


* Semi-Automatic Classification Plugin 
Tab: Postprocessing 
Section: Land cover change 


Before beginning the exercises in this chapter, we need to 
obtain maps of the real transitions that took place between 
2012 and 2018 for two pairs of land use categories (Category 2 
to Category 1 and Category 3 to Category 1). Of all the 
various tools offered by QGIS (see Sect. 1 in Chapter “Basic 
and Multiple-Resolution Cross-Tabulation to Validate Land 
Use Cover Maps”, about basic Cross-Tabulation), in this 
exercise we will be using Land cover change from the 
“Semi-Automatic Classification Plugin”. 


Exercise 1. To create binary change maps for two 
transitions 


Aim 


To create binary change maps for two transitions (2 to 1 and 
3 to 1) using CORINE Land Use maps for the years 2012 
and 2018. For each transition, each pixel on the map is 
allocated a value of 1 or 0 depending on whether or not the 
transition occurred. 


Materials 


CORINE Land Cover Map Val d’Ariège 2012 
CORINE Land Cover Map Val d’Ariège 2018 


Requisites 


All maps must be in raster format and have the same 
resolution, extent and projection. 


M. T. Camacho Olmedo et al. 
Execution 
Step 1 


To create the map of real change, open the “Semi-Automatic 
Classification Plugin” and, in the tab “Postprocessing”, 
select the option Land cover change. Then, fill in the 
required parameters: the earlier map as the reference 
classification (CORINE 2012) and the more recent map as the 
new classification (CORINE 2018). Check the option “Report 
unchanged pixels”. 

QGIS then creates an output raster and a table, stored in 
CSV format, showing all the different combinations 
observed between the two input maps and the code with 
which these combinations are represented in the output 
raster. These combinations (those with 1 or more pixels) and 
the number of pixels affected by them are presented in 
Fig. 1. 


Step 2 


The raster obtained in Step 1 is reclassified twice: (i) to 
represent the areas in which a change was observed from 2 
to 1 and those in which there was no change and (ii) to do 
the same for the transition from 3 to 1. 

To reclassify the raster, open the Reclassify by table tool 
and allocate a new code 1 to ChangeCode 16 (transition 
from 2 to 1) and a new code 0 to ChangeCodes 17, 18, 19 
and 21 (pixels that belonged to Category 2 in 2012 but did 
not change to Category 1 in 2018). All these ChangeCodes 
are highlighted in green in Fig. 1. All the other 
ChangeCodes (i.e. those with a reference class other than 2, 
which cannot possibly undergo the transition from 2 to 1 and 
are therefore considered as No Data) must be allocated a new 
code -99. Save the output raster as TrueChange2to1. 


Fig. 1 Results from Exercise 1. Step 1. Combinations observed between two input maps, code in the output raster and number of pixels affected (table columns: ChangeCode, ReferenceClass, NewClass, PixelSum) 


As regards the transition from 3 to 1, allocate a new code 1 
to ChangeCode 23 (pixels that were Category 3 in 2012 and 
changed to Category 1 in 2018) (in bold type) and a new code 
0 to ChangeCodes 24, 25, and 26 (pixels that were Category 
3 in 2012 but did not change to Category 1 in 2018). All the 
candidate ChangeCodes are highlighted in orange in 
Fig. 1. A new code -99 must be allocated to the remaining 
ChangeCodes (i.e. those which cannot undergo this 
transition). Save the output raster as TrueChange3to1. 


Fig. 2 Results from Exercise 1. Binary change map for transition from 2 to 1 (legend: changed area from 2 to 1; not changed area from 2 to 1; not candidate area; scale bar: 10 km) 


In Fig. 2, the areas that changed from 2 to 1 are shown in 
white, the Category 2 areas that did not change to 1 are 
shown in grey and the non-candidate areas (i.e. those with a 
reference class other than 2) appear in black. In Fig. 3, the 
areas that changed from 3 to 1 are shown in white, the 
Category 3 areas that did not change to 1 in grey and the 
non-candidate areas in black. 
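The reclassification logic used in this exercise can be summarised in a short Python sketch. It works directly on two small categorical maps held as nested lists instead of going through the ChangeCodes of the plugin, so it is an illustration of the rule rather than of the QGIS workflow. The function name and the example maps are hypothetical.

```python
def binary_change_map(map_t1, map_t2, from_cat, to_cat, nodata=-99):
    """Binary map of one transition.

    1      = pixel was from_cat at t1 and is to_cat at t2
    0      = pixel was from_cat at t1 but is not to_cat at t2
    nodata = pixel was not from_cat at t1 (not a candidate)
    """
    result = []
    for row1, row2 in zip(map_t1, map_t2):
        out_row = []
        for a, b in zip(row1, row2):
            if a != from_cat:
                out_row.append(nodata)
            elif b == to_cat:
                out_row.append(1)
            else:
                out_row.append(0)
        result.append(out_row)
    return result


# Hypothetical 2 x 3 maps for the two dates.
lulc_2012 = [[2, 2, 3], [1, 2, 3]]
lulc_2018 = [[1, 2, 1], [1, 4, 3]]
print(binary_change_map(lulc_2012, lulc_2018, from_cat=2, to_cat=1))
print(binary_change_map(lulc_2012, lulc_2018, from_cat=3, to_cat=1))
```

Coding the non-candidate pixels as -99 rather than 0 keeps them out of the later validation statistics, where only pixels that could have undergone the transition should be compared with the soft map.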


Fig. 3 Results from Exercise 1. Binary change map for transition from 3 to 1 (legend: changed area from 3 to 1; not changed area from 3 to 1; not candidate area; scale bar: 10 km) 


1 Correlation 
Description 


Correlation is a statistical measure that evaluates the extent 
to which two variables are related. This means that when one 
variable changes in value, the other variable also tends to 
change. Correlation coefficients are quantitative metrics that 
measure both the strength and the direction of this tendency 
of two variables to vary together. 

Correlation coefficients range from -1 to 1. A coefficient 
of 1 shows a perfect positive correlation, while a 
coefficient close to zero indicates that there is no 
relationship between the variables. A coefficient of -1 
indicates a perfect negative correlation, that is, as one 
variable increases, the other decreases. 

The Pearson correlation measures the linear correlation 
between two variables. Spearman's correlation is the 
non-parametric counterpart of the Pearson correlation and is 
based on the rank order of the variables rather than on their 
values. Spearman's correlation is often used to evaluate 
monotonic but non-linear relationships or relationships 
involving ordinal variables. 


Utility 


Exercises 


1. To validate soft maps produced by the model against a reference 
map of changes 


Correlation analysis is useful for making a rapid assessment 
of a soft map expressing the propensity of an area to change. 
Assuming that the change map is coded 0 for no change and 
1 for change, we would expect a positive value close to 1 if 
the soft map is successfully attributing high change potential 
values to change areas and low change potential values to 
no-change areas. A correlation coefficient of 0 indicates a 
completely random model. A negative coefficient indicates 
that the model is making incorrect predictions in that it 
produces soft maps in which low change potential values are 
awarded to areas in which changes are in fact taking place 
(Bonham-Carter 1994; Camacho Olmedo et al. 2013). 

We used Pearson and Spearman correlations to assess the 
correlation between the map showing real changes and its 
respective change potential map. The Pearson correlation 
between a binary variable and a continuous variable is known 
as the point-biserial correlation and measures the strength of 
association between the two. 
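Both coefficients can be illustrated with a small, dependency-free Python sketch. It is not the Correlation.rsx script: the function names are hypothetical and the five sample pixels are made up. Spearman's coefficient is computed here as the Pearson coefficient of the ranks, with tied values sharing their average rank.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical sample: observed change (0/1) and change potential.
change = [0, 0, 0, 1, 1]
potential = [0.1, 0.2, 0.4, 0.3, 0.9]
print(pearson(change, potential), spearman(change, potential))
```

With a binary change map, the Pearson value obtained this way is the point-biserial correlation mentioned above.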


QGIS Exercise 


Available tools 


* Processing R provider plugin 
Correlation.rsx R script 


We have created an R script that calculates the Pearson and 
Spearman correlation coefficients in QGIS. To do so, it first 
takes a sample of the pixels in the two images. 


Exercise 1. To validate soft maps produced by the 
model against a reference map of changes 


Aim 


To calculate the correlation between the map showing real 
change from 2 to 1 and the corresponding map of potential 
change. 


Materials 

TrueChange2to1 (calculated in the preliminary QGIS exercise 
in this chapter) 

Transition potential map from agricultural to artificial areas 


Requisites 


All maps must be in raster format and have the same reso- 
lution, extent and projection. 


Execution 


If necessary, install the Processing R provider plugin and 
download the Correlation.rsx R script into the R scripts 
folder (processing/rscripts). For more details, see Chapter 
“About this Book”. 


Fig. 4 Exercise 1. Initial map: Change potential map from 2 to 1 (legend: 0, no potential change; 0.977082, maximum potential change) 


Initial maps 


The initial maps for comparison are the TrueChange2to1 
map (see the preliminary QGIS exercise in this chapter) and 
the map showing the change potential from 2 to 1 (Fig. 4). 
Values range from 0 (black) to 0.977082 (white), the value 
for the areas with maximum change potential. The areas with 
a value of 0 (black) are those in which there is no potential 
for change. 


Step 1 


Run the script and fill in the required parameters (names of 
the two maps, proportion of pixels to be sampled, Null 
value) as shown in Fig. 5. The Null value enables us to 
exclude part of the image from the calculations, for instance, 
the pixels with no potential for change. 

The script samples the images in order to speed up the 
computing of the correlation coefficients. It then displays 
both the Pearson and Spearman correlation coefficients and a 
scatterplot in the log files (Figs. 6 and 7). 
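The sampling step can be pictured with the following Python sketch. It is an assumption about how such a step might look, not a transcription of the R script: the function name, the seed parameter and the tiny example maps are hypothetical. Pixels carrying the null value in either map are discarded before a fixed proportion of the remaining pixel pairs is drawn at random.

```python
import random


def sample_valid_pairs(map1, map2, fraction, null_value, seed=0):
    """Randomly sample a fraction of the pixel pairs without null values."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Keep only the pixels that are valid in both maps.
    pairs = [(a, b)
             for row1, row2 in zip(map1, map2)
             for a, b in zip(row1, row2)
             if a != null_value and b != null_value]
    if not pairs:
        return []
    size = max(1, round(len(pairs) * fraction))
    return random.Random(seed).sample(pairs, size)


# Hypothetical maps: observed change (with -99 as null) and change potential.
true_change = [[1, 0, -99], [0, 1, 0]]
potential = [[0.9, 0.2, 0.5], [-99, 0.7, 0.1]]
print(sample_valid_pairs(true_change, potential, 0.5, -99))
```

The sampled pairs can then be passed to any correlation routine. A larger proportion gives more stable coefficients at the cost of computing time.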


Results and Comments 


As can be seen in Fig. 6, both maps show a low positive 
correlation (Pearson = 0.13, Spearman = 0.12), which means 
that the real changes tend to occur more frequently in the 
areas with higher change potential values. However, as can 
be seen in Fig. 7 and by the low value of the coefficients, the 
difference between the potential values for change and 
no-change areas is quite small. 


Fig. 5 Exercise 1. Step 1. Correlation R script (parameters shown: Map1, TrueChange2to1 [EPSG:32631]; Map2, transition potential from 2 to 1 [EPSG:32631]; PercentSampled; R Plots, the output plot file) 

Fig. 6 Results from Exercise 1. Pearson and Spearman correlation coefficients (log output: "Pearson = 0.126070119474099"; "Spearman = 0.114852136594995") 

Fig. 7 Result from Exercise 1. Scatterplot 


2 Receiver Operating Characteristic (ROC) 
Description 


Receiver Operating Characteristic (ROC) analysis enables 
users to evaluate binary classifications with continuous out- 
put or rank-order values. In spatial modelling, ROC analysis 
is used to assess soft maps such as probability or suitability 
maps, which present the sequence in which the model selects 
cells to determine the occurrence of binary events, such as 
change versus no change (Camacho et al. 2013). The prob- 
ability map can be compared with the observed binary event 
map so as to assess the spatial coincidence between the event 
and the probability values. An accurate predictive model 
would produce a probability map in which the highest ranked 
probabilities coincide with the actual event. 

ROC applies thresholds to the probability map to produce 
a sequence of binary predicted event maps and assess the 
coincidence between predicted and real events. A curve is 
obtained in which the horizontal axis represents the false 
positive rate (proportion of no-event cells modelled as an 
event) and the vertical axis the true positive rate (proportion 
of true event cells modelled as an event). 

A standard metric based on the ROC curve is the area 
under the curve (AUC). If the actual events coincide per- 
fectly with the highest ranked probabilities, then the AUC is 
equal to one. A random probability map produces a curve in 
which the true positive rate equals the false positive rate at 
all threshold points, and AUC is therefore 0.5. Probability
maps that produce a ROC curve below the diagonal (AUC <
0.5) have less predictive accuracy than a random map (Mas
et al. 2013; Pontius and Parmentier 2014).
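
The construction of the ROC curve and the AUC described above can be illustrated with a minimal Python sketch. It is not the ROCAnalysis.rsx script or the ROCR implementation; the function names `roc_curve` and `auc` and the toy data are assumptions made purely for illustration. Cells are visited from the highest to the lowest probability, and the false and true positive rates are recorded at each threshold.

```python
def roc_curve(scores, events):
    """Return (false positive rates, true positive rates), one point per threshold.

    scores: probability (or rank) value of each cell
    events: 1 where the event was observed, 0 otherwise
    """
    n_event = sum(events)
    n_no_event = len(events) - n_event
    # Visit cells from the highest to the lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    hits = false_alarms = 0
    for pos, i in enumerate(order):
        if events[i] == 1:
            hits += 1
        else:
            false_alarms += 1
        # Emit a point only once all cells tied at this score are counted.
        is_last = pos == len(order) - 1
        if is_last or scores[order[pos + 1]] != scores[i]:
            fpr.append(false_alarms / n_no_event)
            tpr.append(hits / n_event)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
               for k in range(1, len(fpr)))


# Toy data (hypothetical): six cells, three of which experienced the event.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
perfect = [1, 1, 1, 0, 0, 0]  # events coincide with the highest scores
mixed = [1, 0, 1, 0, 1, 0]    # events only partly follow the ranking
print(auc(*roc_curve(scores, perfect)))  # 1, up to floating-point rounding
print(auc(*roc_curve(scores, mixed)))    # about 0.667
```

A map in which every cell has the same probability yields a single step from (0, 0) to (1, 1) and therefore an AUC of 0.5, which matches the behaviour of a random map described above.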


Utility 


Exercises 


1. To validate soft maps produced by the model against a reference 
map of changes 


The main application of ROC analysis in spatial modelling is 
in the assessment of maps that predict events such as land 
use/cover change, species distribution, disease and disaster 
risks. 


QGIS Exercise 


Available tools 


* Processing R provider plugin 
* ROCR package 
ROCAnalysis.rsx R script 


QGIS does not provide any tool for ROC analysis, although
R provides several packages to this end. We implemented
the ROCAnalysis.rsx R script in QGIS using the QGIS
Processing R provider plugin and the ROCR package to plot
the ROC curve and calculate the AUC (Sing et al. 2005).
This script resamples the images to reduce the number of
observations and then carries out the standard ROC analysis.


Exercise 1. To validate soft maps produced by the 
model against a reference map of changes 


Aim 
To assess the accuracy of a change potential map using ROC 
analysis. 


Materials 


TrueChange2to1 (calculated in the preliminary QGIS exercise
in this chapter)
Transition potential map from agricultural to artificial areas

[Figure: ROC Analysis dialog. Probability map: transition potential from 2 to 1; Event: TrueChange2to1; PercentSampled: 10; NullValue: -99]

Fig. 8 Exercise 1. Step 1. ROC Analysis R script 


Requisites 


All maps must be in raster format and have the same reso- 
lution, extent and projection. 


Execution 


If necessary, install the Processing R provider plugin, and 
download the ROCAnalysis.rsx R script into the R scripts 
folder (processing/rscripts). For more details, see Chapter 
“About this Book”. 


Step 1 


Then run the script and fill in the required parameters 
(Fig. 8): Probability map is a soft prediction map for the 
event; Event is a binary map that indicates the occurrence, or 
not, of the event. This map can have NullValue cells for the 
areas that are not affected by the prediction. 

The maps have large numbers of both “event” and
“non-event” cells, although there are normally far more
“non-event” cells than “event” cells. The PercentSampled
parameter uses random sampling to reduce the number of
non-event cells observed.
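
The effect of this kind of sampling can be sketched in Python. This is only an illustration under stated assumptions, not the logic of the ROCAnalysis.rsx script: we assume that all event cells are kept, that each non-event cell is kept with a probability of PercentSampled / 100, and that cells flagged with the NullValue are discarded. The function name `sample_cells` and the `seed` argument are hypothetical.

```python
import random


def sample_cells(scores, events, percent_sampled, null_value=-99, seed=0):
    """Keep every event cell and a random share of the non-event cells.

    Assumption (not taken from the original script): cells whose event
    value equals null_value are excluded from the analysis altogether.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    kept = []
    for score, event in zip(scores, events):
        if event == null_value:
            continue  # outside the area affected by the prediction
        if event == 1 or rng.random() < percent_sampled / 100:
            kept.append((score, event))
    return kept


# Toy data (hypothetical): two events, five non-events, one null cell.
scores = [0.9, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
events = [1, 0, 1, 0, 0, -99, 0, 0]
sample = sample_cells(scores, events, percent_sampled=50)
print(len(sample), "cells kept, of which",
      sum(e for _, e in sample), "are events")
```

Because only non-event cells are thinned, the proportion of events in the sample is higher than in the full map, which keeps the rare event cells well represented while reducing the number of observations.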


Results and Comments 


The script carries out a sampling of the cells, plots the ROC 
curve and calculates the AUC. The ROC curve (Fig. 9) is 
saved, and the AUC value is displayed in the R console. 
We assessed the change potential map for the transition 
from Category 2 (agriculture) to Category 1 (built-up) using 
ROC analysis. An AUC of 0.74 was obtained. We can 
therefore conclude that this predictive map was reasonably 
successful at identifying the agricultural areas that were most 
likely to be converted to built-up over the period 2012-18. 


3 Difference in Potential (DiP) 
Description 


DiP is based on the Peirce Skill Score (PSS):

$\mathrm{PSS} = H - F$

where H = HITS, i.e. pixels in which both the real maps and
the simulation show change, and F = FALSE ALARMS, i.e.
pixels in which the real maps show persistence but the
simulation shows change.

Fig. 9 Results from Exercise 1. The ROC curve

In DiP, proposed by Eastman et al. (2005), the simulation 
maps are soft (change potential, suitability maps...) rather 
than hard maps. DiP therefore compares the relative weight 
of the potential (in a generic sense) that is allocated to areas 
that changed, i.e. HITS, and the relative weight of the 
potential allocated to areas that did not change, i.e. 
FALSE ALARMS. Results are normally between 1.0 (per- 
fect predictor) and 0 (prediction no better than random). 
Negative values are also possible (prediction systematically 
incorrect). In other words, DiP is defined as the difference 
between the mean potential in the change areas and the mean 
potential in the no-change areas (Pérez-Vega et al. 2012). 
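
As a minimal sketch of this definition (the function name `difference_in_potential` and the toy values are assumptions for illustration only), DiP can be computed by averaging the potential separately over the cells that changed and over those that did not:

```python
def difference_in_potential(potential, changed):
    """Mean potential where change occurred minus mean potential elsewhere.

    potential: soft map values (e.g. change potential between 0 and 1)
    changed:   1 for cells that really changed, 0 for cells that did not
    """
    in_change = [p for p, c in zip(potential, changed) if c == 1]
    in_no_change = [p for p, c in zip(potential, changed) if c == 0]
    mean_change = sum(in_change) / len(in_change)
    mean_no_change = sum(in_no_change) / len(in_no_change)
    return mean_change - mean_no_change


# Toy data (hypothetical): two cells changed, four persisted.
potential = [0.8, 0.6, 0.4, 0.2, 0.1, 0.3]
changed = [1, 1, 0, 0, 0, 0]
print(round(difference_in_potential(potential, changed), 2))  # 0.45
```

Here the mean potential is 0.70 in the change cells and 0.25 in the no-change cells, which gives a DiP of 0.45. If the higher potentials were allocated to the cells that persisted instead, the value would be negative.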


Utility 


Exercises 


1. To validate soft maps produced by the model against a reference 
map of changes 

2. To validate soft maps produced by various models against a 
reference map of changes 


DiP is used as a tool for validating soft maps in a modelling 
exercise, by assessing their predictive accuracy. Users can 
validate and compare several soft maps simulated by the 
same model or several soft maps simulated by different 
models. Pérez-Vega et al. (2012) validated a map of overall 
change potential created by superimposing all the potential 
maps produced by a model.

As these soft maps are rank-order indices, but real land
use typically includes a categorical legend, we would expect 
each real category or transition to be allocated where the 
values are highest in soft-classified maps, whereas other 
categories or transitions would be allocated where the values 
are lower. The validation methods therefore have to compare 
a rank image with a Boolean image in which the real cate- 
gory or transition is located. 

Compared to other assessment techniques such as ROC 
analysis (see previous section), which is based on a relative 
threshold, DiP analysis is a measure of absolute threshold. 
As Eastman et al. (2005) suggested, PSS, DiP and similar 
procedures could be used in models based on absolute per- 
formance, while ROC could be used in models based on 
relative performance. DiP and ROC present a different pic- 
ture, in that in DiP the results show greater variability 
between the potential maps and models. 


QGIS Exercises 


Available tools 


* Processing Toolbox 
Raster Analysis 
Raster layer zonal statistics 
* LecoS plugin
Landscape statistics 
Zonal statistics 


The Difference in Potential is a simple subtraction between 
average values from two maps. The required functions are 
related to zonal statistics which is why in these exercises we 
will be using the Raster layer zonal statistics tool. 


Exercise 1. To validate soft maps produced by the 
model against a reference map of changes 


Aim 


To validate and compare two change potential maps (soft
maps), obtained from the same model, against a reference
map: the CORINE Land Use map of real changes (from 2012 to
2018).


Materials 


TrueChange2to1 (calculated in the preliminary QGIS exercise
in this chapter)
TrueChange3to1 (calculated in the preliminary QGIS exercise
in this chapter)
Transition potential map from agricultural to artificial areas
Transition potential map from forests to artificial areas

Fig. 10 Exercise 1. Initial map: change potential map from 3 to 1


Requisites 


All maps must be raster and have the same resolution, extent 
and projection. 


Execution 
Initial maps 


In this exercise, we will be using the TrueChange2to1 and
the TrueChange3to1 maps (see the preliminary QGIS exercise
in this chapter), the change potential map from 2 to 1
(see Sect. 1) and the change potential map from 3 to 1
(Fig. 10). Values are from 0 (black) to 0.99753 (white), the
latter corresponding to the areas with the maximum potential
for change. The areas in which this change is not predicted
are allocated a value of 0.


Step 1 


We open Raster layer zonal statistics (located in the Processing
Toolbox) to extract the mean values from the change
potential map from 2 to 1 (Input layer) using the
TrueChange2to1 map as the Zones layer (Fig. 11).


[Figure: Raster layer zonal statistics dialog. Input layer: transition potential from 2 to 1; Zones layer: TrueChange2to1; Reference layer: Input layer]

Fig. 11 Exercise 1. Step 1. Raster layer zonal statistics

Step 2


We then repeat the exercise with the change potential map
from 3 to 1 (Input layer) using the TrueChange3to1 map as
the Zones layer.


Results and Comments 


The mean value for change potential from 2 to 1 in the 
candidate areas that actually change to Category 1 is 0.43, 
while in the candidate areas that did not change, the mean 
value is 0.20. Therefore, the Difference in Potential is 0.23. 
In spite of the fact that the change potential is twice as high 
in the areas that changed to Category 1 than in those that did 
not change, the absolute potential (about 0.43) is quite low. 

As regards the change from 3 to 1, the mean value for
change potential in the candidate areas that changed to
Category 1 is 0.31, while in the candidate areas that did not
change, the mean value is 0.02. Therefore, the Difference in
Potential is 0.29. In spite of the fact that the absolute
difference is quite low, it is important to highlight that the
change potential value in the candidate areas that did not
change is almost zero. From this point of view DiP throws
up interesting results.

The fact that these soft maps have similar DiP values 
means that they have similar predictive capacity. This is 
slightly higher in the map charting potential change from 3 
to 1, although we should also bear in mind that the change 
from 3 to 1 affects just one small, contiguous area. 


Exercise 2. To validate soft maps produced by 
various models against a reference map of changes 


Aim 


To validate and compare two soft maps obtained from two
different models against a reference map: the CORINE Land Use
map of real changes (from 2012 to 2018).


Materials 

TrueChange2to1 (calculated in the preliminary QGIS exercise
of this chapter)
Transition potential map from agricultural to artificial areas
Markovian probability map for artificial areas Ariège Valley


Requisites 


All maps must be raster and have the same resolution, extent
and projection.

Fig. 12 Exercise 2. Initial map: Markovian probability map for
Category 1


Execution 
Initial maps 


In this exercise, we will be using the TrueChange2to1 (see
the preliminary QGIS exercise in this chapter), the change
potential map from 2 to 1 (see Sect. 10.1) and the Markovian
probability map for Category 1 (Fig. 12), with values from 0
(black) to 0.997692 (white), the latter corresponding to the
areas with the highest probability of being Category 1.


Step 1

In order to obtain the mean values from the change potential
map for the transition from 2 to 1, follow the process set out
in Exercise 1 of this section.

Step 2


We now use the Raster layer zonal statistics tool to extract
the mean values from the probability map for Category 1
(Input layer) using the TrueChange2to1 map as zones
(Zones layer). In other words, in both soft maps (change
potential map and probability map) we extract the mean
values using the same map as zones.


Results and Comments 


As commented in Exercise 1 of this section, the mean value 
for change potential from 2 to 1 in the candidate areas that 
did actually change to Category 1 is 0.43; while in the 
candidate areas that did not change, the mean value is 0.20. 
This means that the Difference in Potential is 0.23. In spite of 
the fact that the change potential is twice as high in the areas 
that changed to Category 1 than in those that did not change, 
the absolute potential (about 0.43) is quite low. 

The mean value for the probability of Category 1 in the
candidate areas that did actually change from Category 2 to
1 is 0.013, while in the candidate areas that did not change,
the mean value is 0.0098. The Difference in Potential is
therefore 0.0032. This very small difference means that the
purely Markovian probability map has virtually no predictive
value.

The two soft maps, each generated by a different model to 
predict the changes in land use and cover, produce highly 
varying results: some areas considered to have high change 
potential by one model are attributed low change potential 
by the other. 

In this case, it is important to remember that we are 
comparing two quite different change potential maps. Firstly, 
a change potential map in which only one specific transition 
is evaluated (in this case from 2 to 1) and therefore only one 
source category (Category 2) is considered for its potential 
for change to the target category (Category 1). Secondly, a 
suitability map, which generates the probability of any part 
of the study area belonging to a particular target category (in 
this case Category 1) at the end of the period regardless of its 
original source category. However, when comparing the 
outputs of these models, we evaluated the same transition in 
both soft maps and validated them against the same real 
change. 

The second main difference is that the change potential 
map is based not only on two LUC maps but also on selected 
drivers, while the Markov Probability map is computed 
without additional knowledge (drivers). The conclusion is 
that when comparing different maps, it is important to bear 
in mind that the data may have been obtained in different 
ways. 


M. T. Camacho Olmedo et al. 


4 Total Uncertainty, Quantity Uncertainty, 
Allocation Uncertainty 


Description 


In an exhaustive state-of-the-art review of the accuracy of
model outputs, Krüger and Lakes (2016) proposed an uncertainty
measurement for probability maps such as soft predictions,
which could be considered equivalent to the disagreement
indices for hard prediction maps.

These authors proposed a measurement of the probability 
of predictions being misses (PM) (also called omissions) or 
false alarms (PF) (also called commissions) for soft predic- 
tion maps. They also introduced three uncertainty measures: 


Quantity uncertainty: $QU = 2 \times (PM - PF)$
Allocation uncertainty: $AU = 4 \times PF$
Total uncertainty: $TU = QU + AU$


where PM is the average of the soft prediction map after pixel
values equal to or higher than 0.5 have been set to zero (so
that only values less than 0.5 contribute), and PF is the
average of the soft prediction map after values less than 0.5
have been set to zero and values equal to or higher than 0.5
have been converted into their complement to 1 (0.8 becomes
0.2; 0.51 becomes 0.49).
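
The definitions of PM and PF can be sketched in a few lines of Python. This is an illustrative reading of the text, not code from Krüger and Lakes (2016): the function name `miss_and_false_alarm` is hypothetical, and we assume that both averages are taken over all the cells of the map, including those set to zero.

```python
def miss_and_false_alarm(soft_values):
    """Return (PM, PF) averaged over all cells of a soft prediction map."""
    # PM map: keep the values below 0.5 and set the rest to zero.
    pm_map = [v if v < 0.5 else 0.0 for v in soft_values]
    # PF map: values >= 0.5 become their complement to 1, the rest zero.
    pf_map = [1.0 - v if v >= 0.5 else 0.0 for v in soft_values]
    n = len(soft_values)
    return sum(pm_map) / n, sum(pf_map) / n


# Toy soft prediction map (hypothetical), flattened to a list of cells.
soft = [0.0, 0.1, 0.2, 0.8, 0.51, 0.0]
pm, pf = miss_and_false_alarm(soft)
print(round(pm, 3), round(pf, 3))  # 0.05 0.115
```

In this toy map only the values 0.1 and 0.2 contribute to PM, while 0.8 and 0.51 contribute 0.2 and 0.49 respectively to PF.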


Utility 


Exercises 


1. To validate soft maps produced by the model 


The uncertainty indices proposed by Krüger and Lakes
(2016) for probability maps such as soft predictions are
equivalent to disagreement indices for hard classified maps
such as hard predictions. These indices allow us to evaluate
the uncertainty of soft predictions by comparing the level of
uncertainty in soft prediction outputs with the level of
disagreement in hard prediction outputs.


QGIS Exercise 


Available tools 


* Raster 
Raster Calculator 
Raster Layer Statistics 


No specific tool implemented in QGIS or R directly calculates
the uncertainty indices proposed by Krüger and Lakes (2016).
However, these can be easily obtained through common spatial
analysis tools, such as Raster Calculator and Raster Layer
Statistics.

Fig. 13 Exercise 1. Initial map: Soft prediction map


Exercise 1. To validate soft maps produced by the 
model 


Aim 


To validate the soft map produced by the LCM model for the
Ariège Valley case study.


Materials


Soft prediction LCM Val d'Ariège 2018


Requisites 

The map must be raster. 
Execution 

Initial maps 


Figure 13 shows the soft prediction map (2018)
generated by Land Change Modeler (LCM) for the Ariège Valley,
based on CLC 2000 and CLC 2012 training dates. The
map values range from 0, which means minimal probability
of change, to 1, which means maximal probability of change.


Step 1 


To calculate the PM map (probability of being a miss), we
use the Raster Calculator twice. First, we generate an
intermediate map in which all pixel values less than 0.5 are
coded as 1: calculator expression =
"CLC_predict_2018_soft_UTM@1" < 0.5. Then, we multiply this
mask (intermediate map named "TMP_1") by the soft prediction
map: calculator expression =
"TMP_1@1" * "CLC_predict_2018_soft_UTM@1". As a result, we
obtain the PM map (Fig. 14).


Step 2 


To calculate the PF (probability of being a false alarm) map,
we need to use the Raster Calculator again. With the calculator,
we can first compute an intermediate map in which
all pixel values equal to or greater than 0.5 are coded as 1:
calculator expression = "CLC_predict_2018_soft_UTM@1" >= 0.5.
Then, we subtract the values of the soft prediction
map from 1 before multiplying it by the mask (intermediate
map, here named "TMP_2"): calculator expression =
(1 - "CLC_predict_2018_soft_UTM@1") * "TMP_2@1". As a
result, we obtain the PF map (Fig. 15).


Step 3 


Finally, we use the Raster Layer statistics tool to calculate 
the average PM and PF values from the corresponding maps. 


2018 PM average = 0.00963
2018 PF average = 0.00577


Step 4 


Once we have obtained the PM and PF values, we can
calculate the Quantity Uncertainty (QU), Allocation Uncertainty
(AU) and Total Uncertainty (TU) following the formulas
provided by Krüger and Lakes (2016):

$QU = 2 \times (0.00963 - 0.00577) = 0.00772$
$AU = 4 \times 0.00577 = 0.02308$
$TU = 0.00772 + 0.02308 = 0.0308$

Fig. 14 Exercise 1. Step 1. PM (probability of being a miss) map
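
The arithmetic in Step 4 can be checked with a short Python sketch. The function name `uncertainty_indices` is hypothetical; the formulas are those given in this section, and the inputs are the PM and PF averages obtained in Step 3.

```python
def uncertainty_indices(pm, pf):
    """Return (QU, AU, TU) from the average PM and PF values."""
    qu = 2 * (pm - pf)  # quantity uncertainty
    au = 4 * pf         # allocation uncertainty
    return qu, au, qu + au


# PM and PF averages reported in Step 3 for the 2018 soft prediction.
qu, au, tu = uncertainty_indices(0.00963, 0.00577)
print(round(qu, 5), round(au, 5), round(tu, 5))  # 0.00772 0.02308 0.0308
```

The ratio AU / QU is close to 3 for these values, which is the relationship discussed in the results below.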


Results and Comments 


All three uncertainty indices are very low because only a 
small proportion of the pixels change category. The soft 
prediction map (Fig. 13) indicates that persistence is the 
dominant trend and there are very few high-probability 
soft-predicted changes. 

For this dataset, quantity uncertainty is about three times
lower than allocation uncertainty. It is important to bear in
mind that areas with low rates of change also have lower
uncertainty rates, which limits the significance of these
indices.

Fig. 15 Results from Exercise 1. Step 2. PF (probability of being a
false alarm) map

Abstract


When validating Land Use Cover (LUC) products, pat- 
tern analysis can be used to assess the agreement between 
the patterns of two maps. It therefore complements other 
methods and techniques that focus exclusively on the 
quantity (proportions) and allocation agreement between 
the categories. Spatial metrics are the first step for any 
analysis of the patterns of categorical maps. With the 
wide range of spatial metrics available, it is possible to 
fully characterize the pattern of any map. It can also be 
characterized in greater detail using other more complex 
techniques, as explained in the next chapter of this book 
(Chap. “Advanced Pattern Analysis to Validate Land Use
Cover Maps”). This chapter provides an introduction to
the essentials of pattern analysis by explaining the theory 
behind the calculations of spatial metrics. To this end, we 
offer examples of how to use spatial metrics to validate 
LUC maps (either single maps or series) and Land Use 
Cover Change (LUCC) simulations from modelling 
exercises. We also include two example exercises illus- 
trating how spatial metrics can be used for general 
purposes of pattern characterization without validation. 
Despite all the spatial metrics currently available, in this 
chapter we will be focusing exclusively on the most 
common and most suitable metrics for carrying out the 
type of analyses being performed here. Most of the spatial 
metrics proposed in the literature are closely related. This 
means that users must select the metrics that provide most 
information for their specific cases, so as to avoid 
reiteration and make sure that clear conclusions are 
reached. The example exercises were drawn up with maps
(CORINE, SIOSE) and modelling exercises from the
Asturias Central Area and Ariège Valley databases.


Keywords

Pattern analysis · Landscape Ecology · Landscape Metrics · Spatial Metrics


1 Spatial Metrics 
Description 


Spatial metrics are a set of indices or metrics that were first 
developed within the field of landscape ecology (Forman 
1995), which is why they are often referred to as landscape 
metrics. Landscape metrics were designed to quantitatively 
characterize the pattern of a landscape, and its relationship 
with landscape processes. Nowadays, they are also widely 
used to characterize the pattern of categorical maps. When 
used for this purpose, they are generally referred to as spatial 
metrics (Herold et al. 2005). 

Spatial metrics were initially developed for raster data, 
although some of them have also been adapted for calcula- 
tion with vector data, for which the polygon is the unit of 
measurement. For raster data, the reference concept for 
calculating the metrics is the patch. 

A patch is defined as a contiguous area of pixels 
belonging to the same category. The number and shape of 
the patches in a raster will depend on the neighbourhood rule 
applied (Fig. 1). Under the 4-cell neighbourhood rule, two
pixels with the same value are considered to belong to the
same patch if one is immediately above, below, to the left or
to the right of the other. An 8-cell neighbourhood will also
consider pixels that are diagonal to each other as part of the
same patch.
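
The effect of the neighbourhood rule on patch delineation can be illustrated with a minimal Python sketch. The function name `count_patches` and the 3 x 3 toy raster are assumptions for illustration only; they are not the example landscape of Fig. 1, nor the algorithm used by any particular tool. The sketch groups equal-valued cells with a simple flood fill, using either four or eight neighbours.

```python
def count_patches(grid, eight_cell=False):
    """Count patches (connected groups of equal-valued cells) in a raster."""
    rows, cols = len(grid), len(grid[0])
    # 4-cell rule: above, below, left and right neighbours.
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if eight_cell:
        # 8-cell rule: diagonal neighbours are considered as well.
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = [[False] * cols for _ in range(rows)]
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            patches += 1  # an unvisited cell starts a new patch
            seen[r][c] = True
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and grid[ny][nx] == grid[y][x]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return patches


# Toy raster (hypothetical) with two categories.
grid = [
    [1, 2, 2],
    [2, 1, 2],
    [2, 2, 1],
]
print(count_patches(grid))                   # 5
print(count_patches(grid, eight_cell=True))  # 2
```

With the 4-cell rule the three diagonal cells of category 1 are separate patches and category 2 forms two patches; with the 8-cell rule each category forms a single patch, so the same raster yields fewer, larger patches.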

Spatial metrics can be calculated at three different levels:
per patch, per category or for the whole map (landscape
level). In the first case, each metric is calculated for every
single patch. In the second case, the metrics are calculated
for all the patches belonging to every single category on the
map. In the last case, the metrics are calculated for the map
as a whole. Not all metrics can be calculated at all three
levels of analysis; some of them are only available for
certain levels.

Fig. 1 Examples of patch delineation according to 4-cell
and 8-cell neighbourhood rules for an example landscape

[Fig. 1 panel: Raster (example)]

There is a wide variety of metrics available, and new ones
are regularly being proposed (Botequilha Leitao et al. 2006;
Jaeger 2000; McGarigal 2018). Most of them are closely
correlated. This means that despite the large number of
metrics available, many of them provide the same or very
similar information.

Spatial metrics are usually classified into groups accord- 
ing to the information they provide: area, density and edge 
metrics; shape metrics; connectivity metrics and diversity 
metrics. The first group (area, density and edge) provides 
information about the area and perimeter of the patches. 
Shape metrics assess the complexity of the shape of the 
patches, based on their area and perimeter, while connec- 
tivity metrics quantify the degree to which patches relate to 
each other (how connected they are) and are usually calcu- 
lated at the category level. Finally, diversity metrics quantify 
the heterogeneity of the map and can only be computed at a 
landscape level. 

For an overview of the range of metrics available and a
description of each, please see Botequilha Leitao et al. (2006),
Jaeger (2000) and McGarigal (2018).

[Fig. 1 panels: 4-cell neighbourhood rule, 10 patches; 8-cell neighbourhood rule, 8 patches]


Utility 


Exercises 


1. To validate a map against reference data / map 

2. To validate a simulation against a reference map 

3. To validate simulated changes against a reference map of changes 
4. To validate a series of maps with two or more time points 

5. To validate a series of maps with two or more time points (vector) 
6. To validate a series of maps with two or more time points (raster) 


Spatial metrics are some of the most popular tools for ana- 
lysing the pattern of categorical maps. Using the wide 
diversity of spatial metrics currently available, we can obtain 
numerous quantitative measurements of the fragmentation, 
shape complexity and heterogeneity of the landscape. 

Spatial metrics can be calculated for the whole map or for 
certain specific features. In the case of Land Use Cover 
Change analyses, including LUCC modelling, spatial met- 
rics can be specifically used to characterize the pattern of the 
elements that change. 

Spatial metrics are usually highly dependent on the scale
of analysis (Šímová and Gdulová 2012). Scale refers not
only to the cartographic scale at which the map was drawn
but also to its spatial and thematic resolutions. This makes
them useful for evaluating the impact of changes in the scale
on the way a landscape is represented on a map. They can
also be used to assess the impact of resampling categorical
maps. However, this also makes them very uncertain tools
when comparing two or more maps that have different
resolutions or were obtained at different scales. In these cases,
the results must be treated with caution.

For maps at the same scale, spatial metrics can be used to 
assess to what extent their patterns differ. In other words, 
they assess the relative complexity of their shapes and 
perimeters, the degree to which they are fragmented, or how 
close patches belonging to the same categories are to each 
other. 


QGIS Exercises 


As mentioned earlier, there are a lot of spatial metrics 
available, many of which are highly correlated. Despite this, 
there is a wide range of different metrics that characterize 
map patterns in different ways. 

It would be impossible to present example exercises for 
all the available spatial metrics in the literature, as there 
would be enough material to fill an entire book. This is why, 
in the exercises proposed here, we focus on the metrics most 
commonly used for validating maps or analysing their 
uncertainties. These metrics are also suitable for many other 
exercises that users may typically wish to perform. However, 
they should be aware that other metrics are available which 
may be more suitable or useful in certain specific cases. 


Available tools 


* Raster
  Landscape ecology
    Landscape statistics
    Landscape vector overlay
* Processing toolbox
  LecoS
    Landscape modifications
      Neighbourhood Analysis (Moving Window)
    Landscape statistics
      Count raster cells
      Landscape-wide statistics
      Patch statistics
      Zonal statistics
    Landscape vector overlay
      Overlay raster metrics (Polygon)
* Processing toolbox
  SAGA
    Raster analysis
      Pattern analysis
* Processing toolbox
  SAGA
    Vector polygon tools
      Polygon shape indices

Despite the widespread use of spatial metrics, QGIS offers
few tools for calculating them. For vector maps, we have the 
Polygon shape indices tool, which characterizes the area, 
perimeter and shape compactness of polygons. Metrics are 
calculated for each polygon, i.e. at patch level. 

Of the tools available for raster maps, we highlight two: 
the LecoS plugin (Jung 2016) and the SAGA “Pattern
analysis” tool. GRASS also provides a suite of tools for
calculating spatial metrics: r.li tools. However, there are 
certain problems with their integration in the QGIS envi- 
ronment that prevent their normal use. This is why we have 
not considered them as an option for calculating spatial 
metrics in this book. 

The SAGA tool only allows the user to calculate a few 
metrics (relative richness, diversity, dominance, fragmenta- 
tion, number of different classes, centre versus neighbours), 
although these are not amongst the most frequently used 
when comparing map patterns. These metrics can only be 
calculated for the entire landscape or study area and are not 
available at patch or class level. In addition, although the 
user may select the window at which the spatial metrics are 
calculated (3 x 3, 5 x 5 or 7 x 7), the 8-cell neighbour- 
hood rule is applied by default and cannot be changed. 

The “LecoS” plugin offers a wider set of metrics and two 
levels of analysis: per class and for the entire map. It also 
provides a few extra tools with which to manipulate the 
maps and extract specific elements that may be of interest to 
users. The plugin also allows us to calculate the metrics for 
specific areas of the map that overlay a vector layer defined 
by the user. Nonetheless, these spatial metrics cannot be 
calculated per patch and the 8-cell neighbourhood used by 
default for the calculation cannot be changed. For full 
information about the plugin and the various possibilities it 
offers, readers should consult the LecoS website and the
paper by Jung (2016).

The R package “landscapemetrics”¹ provides almost all
the options currently available for calculating spatial metrics
(Hesselbarth et al. 2019). It supplies many more metrics than
the “LecoS” plugin, allows the user to select the neighbourhood
rule and includes the three levels of analysis
(patch, category or whole landscape). R offers the full
workflow available through FRAGSTATS (McGarigal
2015),² a free, very user-friendly software, which is widely
regarded as the software of reference for calculating spatial
metrics.

Although the R package offers us all the options currently
available for calculating spatial metrics, in this chapter we
will be focusing exclusively on the LecoS plugin. This is
because it provides enough tools for the exercises we
propose, and is a tested, efficient software which allows us to
perform these analyses easily and quickly.

¹ https://r-spatialecology.github.io/landscapemetrics/
² https://www.umass.edu/landeco/research/fragstats/fragstats.html.


Exercise 1. To validate a map against reference 
data/map 


Aim 


To assess to what extent the pattern of the CORINE map is 
similar to the pattern of the reference SIOSE map, which 
charts the real situation on the ground. 


Materials 


SIOSE Land Use Map Asturias Central Area 2011 
CORINE Land Use Map Asturias Central Area 2011 


Requisites 


The two maps must be raster. The background class must be 
0 or no data. 


Execution 
Step 1 


One of the requisites of the “LecoS” plugin is that no
category, apart from the background, is coded with the number 0.
In our maps, the category “agricultural areas” is coded 0.
The first step is therefore to reclassify the maps, so that the
background is coded 0 (currently it is coded 12) and all other
categories have codes other than 0.

The maps are reclassified using the Reclassify by table 
tool (Processing toolbox > Raster analysis > Reclassify by 
table). After opening the tool, indicate the map you want to
reclassify (CORINE map) and fill in the “Reclassification
table” with the values that will replace the existing values in
the raster (Fig. 2). When filling in this table, bear in mind
that the tool will search for values that are less than or equal 
to the maximum and greater than the minimum. In other 
words, if you reclassify as 2 (new value) the values with a 
maximum value of 1 and a minimum value of 0, all the 
pixels with value 1 will be reclassified as 2. 1 is the only 
value greater than 0 that is also less than or equal to 1. 

Bearing these criteria in mind, fill in the reclassification 
table and run the tool (Fig. 3). 
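
The range rule described above can be made concrete with a short sketch. The Python snippet below is not the code used by QGIS; it is a minimal illustration that assumes the rule "minimum < value <= maximum" and uses a hypothetical helper named `reclassify`. It also assumes that values matching no row are left unchanged.

```python
def reclassify(value, table):
    """Return the new code for `value` using rows of (minimum, maximum, new_value).

    Illustrative helper (not part of QGIS): a row matches when
    minimum < value <= maximum, the rule described in the text.
    Values that match no row are returned unchanged (an assumption).
    """
    for minimum, maximum, new_value in table:
        if minimum < value <= maximum:
            return new_value
    return value


# One row: values greater than 0 and less than or equal to 1 become 2.
table = [(0, 1, 2)]
print([reclassify(v, table) for v in (0, 1, 2)])  # [0, 2, 2]
```

Only the pixel with value 1 is recoded: 0 is not greater than the minimum, and the original value 2 is above the maximum, so it keeps its code.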
Step 2


After running the tool, you will obtain a reclassified map that 
meets the requirements of the LecoS plugin. You are now in 
a position to calculate the spatial metrics. This is done by 
accessing the Landscape statistics option of the “LecoS” 
plugin via the following route: Raster » Landscape ecol- 
ogy » Landscape statistics. 

Once there, in the “Landcover grid” box indicate the
raster for which you want to calculate the spatial metrics
(CORINE reclassified), the “No-data” value (0, which is the
background) and the spatial resolution of the raster (50 m,
which you can check in the layer properties). You must also
select the particular metrics you want to obtain (Fig. 4).

Several spatial metrics can be selected at the same time,
using the “Select multiple metrics” tab. In this case, we
selected the following: Land cover; Landscape proportion;
Number of patches; Greatest patch area; Smallest patch area;
Mean patch area; Median patch area; Fractal dimension
index; Like adjacencies; Patch cohesion index. Once you
have done this, run the function.

If your computer is unable to calculate all the metrics at 
the same time, split the task into two (e.g. two groups of five 
metrics). In this case, after running the tool for the second 
time, the results must be gathered together in a single file, as 
the plugin creates one file for each time you run the tool. 


Step 3 


The last step is to repeat the whole workflow for the reference
raster, i.e. for the SIOSE map. In this case, you will
probably need to split the spatial metrics calculation into
several runs, as the plugin may not be able to handle all the
information at once. As the SIOSE map is made up of a
larger number of patches, the plugin will need more time to
make all the calculations.


Results and Comments 


Once the spatial metrics for each of the maps have been 
calculated, the results of the analysis will be stored in CSV 
files in the folder of your choice. 

You will have one file for each time you have run the 
tool. The first step will therefore be to gather all the infor- 
mation together in one file to make it easier to compare the 
spatial metrics for the two maps (Table 1). This is done using 
a spreadsheet program such as OpenOffice Calc or Microsoft 
Excel. Once the results have been correctly organized, you 
can now compare the pattern of the two maps (Tables 1, 2, 3 
and 4). 
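
For readers who prefer a script to a spreadsheet, the gathering step could be sketched as follows. This is only an illustration: the column names `class`, `metric` and `value` are assumptions made for the example and may not match the header of the files written by the plugin, so check your own CSV files first. The helper `merge_metric_files` is hypothetical.

```python
import csv
import io


def merge_metric_files(named_files):
    """Combine several metric files into one table keyed by (class, metric).

    `named_files` maps a map name (e.g. "CORINE") to an open text file.
    The column names "class", "metric" and "value" are assumptions made
    for this sketch; adapt them to the header of your own CSV files.
    """
    merged = {}
    for map_name, handle in named_files.items():
        for row in csv.DictReader(handle):
            key = (int(row["class"]), row["metric"])
            merged.setdefault(key, {})[map_name] = float(row["value"])
    return merged


# Tiny in-memory stand-ins for two exported files (values from Table 1).
corine = io.StringIO("class,metric,value\n7,Number of patches,28\n")
siose = io.StringIO("class,metric,value\n7,Number of patches,2462\n")

table = merge_metric_files({"CORINE": corine, "SIOSE": siose})
print(table[(7, "Number of patches")])  # {'CORINE': 28.0, 'SIOSE': 2462.0}
```

With real files, the `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`.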
Fig. 2 Exercise 1. Step 1. Table required for the “Reclassify by Table” tool


The “Land cover” and “Landscape proportion” metrics
(Table 1) offer information about the space occupied by
each category on the map. This gives us an insight into the
composition of the landscape, i.e. the proportions or areas
occupied by each category on the map, regardless of exactly
where these categories are allocated.

The “Land cover” metric indicates the surface area in
square metres occupied by each category. The “Landscape
proportion” gives the proportion of the entire map (out of 1)
occupied by each category. If the two maps have the same
extent, both metrics will provide the same information, albeit
in different units (square metres and proportion of the map).
Comparing maps with different extents is not recommended and
could lead to important issues in the interpretation of the analysis.
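
The relationship between the two composition metrics can be illustrated with a small sketch. It assumes that the area of a category is its pixel count multiplied by the pixel area, and that the proportion is taken over all non-background pixels. The raster is made up for the example and the helper `composition` is hypothetical, not part of the plugin.

```python
from collections import Counter

CELL_SIZE = 50  # metres; the resolution used in this exercise
BACKGROUND = 0


def composition(raster, cell_size=CELL_SIZE, background=BACKGROUND):
    """Return {category: (area in square metres, proportion of the map)}."""
    counts = Counter(v for row in raster for v in row if v != background)
    total = sum(counts.values())
    pixel_area = cell_size * cell_size
    return {c: (n * pixel_area, n / total) for c, n in counts.items()}


# A made-up 3 x 4 raster with two categories and some background pixels.
raster = [
    [1, 1, 2, 0],
    [1, 2, 2, 0],
    [1, 1, 2, 0],
]
result = composition(raster)
for category, (area, share) in sorted(result.items()):
    print(category, area, round(share, 2))
```

Because both values derive from the same pixel counts, they rank the categories identically as long as the extent is the same.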

In our case, the landscape composition of the two maps is
very similar. All the categories are represented in similar
proportions. Nonetheless, some differences were observed in
the case of mineral extraction sites (Category 5 after
reclassification), dump sites (Category 6) or road and rail
networks (Category 7), among others.

The “Number of patches” (Table 1) indicates how many
patches (contiguous areas with the same pixel value) make
up each category. This metric is easy to understand and
provides us with very useful information about how fragmented a
particular category is, thus giving us an insight into the
configuration of the landscape, i.e. the way each category
is allocated in the map.
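
The notion of a patch as a contiguous group of pixels can be sketched with a simple flood fill. The function below is a hypothetical illustration, not the plugin's code. Which neighbourhood rule a given tool applies is not stated here, so both the 4-neighbour and the 8-neighbour rules are offered and the default is an assumption.

```python
def count_patches(raster, category, diagonal=True):
    """Count contiguous groups of pixels holding `category`.

    `diagonal=True` applies the 8-neighbour rule and `diagonal=False`
    the 4-neighbour rule. The default is an assumption of this sketch.
    """
    rows, cols = len(raster), len(raster[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = set()
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if raster[r][c] != category or (r, c) in seen:
                continue
            patches += 1  # a new, unvisited patch starts here
            stack = [(r, c)]
            seen.add((r, c))
            while stack:  # flood fill over the whole patch
                y, x = stack.pop()
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and raster[ny][nx] == category
                            and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return patches


raster = [
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
print(count_patches(raster, 1, diagonal=True))   # 1
print(count_patches(raster, 1, diagonal=False))  # 3
```

The same four pixels form one patch or three depending on the rule, which is why the neighbourhood rule matters when patch counts from different tools are compared.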

Unlike landscape composition, important differences can
be observed between the two maps in terms of landscape
configuration. The SIOSE map is much more fragmented
than the CORINE one. This difference is very significant, for
example, in the road and rail networks category (Category 7
after reclassification). Whereas in CORINE this class is
made up of just 28 patches, in SIOSE it is much more
fragmented, with 2,462 patches (Table 1).


Fig. 3 Exercise 1. Step 1. Reclassify by Table


Fig. 4 Exercise 1. Step 2. LecoS plugin

These differences are to be expected given that the 
CORINE and SIOSE maps use different Minimum Mapping 
Units (MMU) and Minimum Mapping Widths (MMW). 
SIOSE represents homogeneous areas covering over 0.5-2 ha
with a minimum width of 15 m, whereas CORINE only 
shows areas of over 25 ha with a minimum width of 100 m. 
Many small patches that appear in SIOSE do not therefore 
appear on the CORINE map. 

Those land use categories that usually appear on the ground 
in small areas, such as small dump sites, or with linear features 
such as most of the road network, are not represented on the 
CORINE map, although they do appear in SIOSE. This 
explains the differences between the two maps in terms of the 
areas or proportions of certain classes referred to above. 

It would be wrong therefore to conclude that CORINE 
does not map these areas of disagreement between the two 
maps well. They do not appear in CORINE simply because 
it has different MMU and MMW rules. 

The “Greatest patch area” and “Smallest patch area”
metrics (Table 2) help us to characterize the degree of
fragmentation referred to earlier. The first metric measures
the area (in square metres) of the largest patch on the
map. The second metric does the same for the smallest patch
on the map.

These two metrics highlight CORINE's simpler pattern
and higher level of generalization. With a few exceptions,
the largest patch in CORINE is usually larger than its
counterpart in SIOSE. For the smallest patch, there are only
small differences between the maps. In most cases, the smallest
patch occupies 2,500 m² in both maps. In other words, the
smallest patch covers a single pixel with a 50 m edge
(50 × 50 = 2,500 m²). Such a patch does not comply with the MMU
and MMW rules of CORINE. This may be due to the
presence of isolated pixels on the edge of the map after
clipping it or due to the rasterization process.

The “Mean patch area” and “Median patch area” metrics
(Table 3) also help us characterize the fragmentation of the
map. These metrics measure the mean area and the median
area of all the patches belonging to a particular category. As
one might imagine, mean and median patch area are always
smaller for SIOSE than for CORINE because of SIOSE's
higher fragmentation. This is because the SIOSE map, due to
its smaller MMU and MMW, draws more small polygons
than CORINE, which tends to group them together in larger
polygons.
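
Since the mean patch area of a category is simply its total area divided by its number of patches, the values in Table 3 can be cross-checked against Table 1. The short sketch below does this for Category 1, using the figures reported in those tables.

```python
# Values for Category 1, copied from Table 1.
land_cover = {"CORINE": 683_257_500, "SIOSE": 640_252_500}  # square metres
patches = {"CORINE": 245, "SIOSE": 971}

# Mean patch area = total area of the category / number of patches.
mean_patch_area = {name: land_cover[name] / patches[name] for name in land_cover}
for name, area in mean_patch_area.items():
    print(f"{name}: {area:,.2f} m2")
```

The results agree with the mean patch areas listed for Category 1 in Table 3. The median cannot be derived in this way, as it requires the area of every individual patch.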

The “Fractal dimension index” (Table 4) measures the 
mean shape complexity of the patches that make up each 
category. Values closer to 1 indicate simple geometries, 
more closely resembling a square, whereas values closer to 2 
indicate more complex geometries, which are less like the 
simple shape of a square. 
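
As an illustration of how such an index behaves, the sketch below uses the patch-level definition given in the FRAGSTATS documentation, 2 ln(0.25 p) / ln(a), where p is the patch perimeter and a its area. Whether the plugin computes the index in exactly this way is an assumption; the metric reported in Table 4 is a mean over all the patches of a category.

```python
import math


def fractal_dimension_index(perimeter, area):
    """FRAGSTATS-style patch index: 2 * ln(0.25 * perimeter) / ln(area).

    Perimeter in metres and area in square metres. The result is
    undefined for an area of exactly 1, because ln(1) is zero.
    """
    return 2 * math.log(0.25 * perimeter) / math.log(area)


# A single 50 m pixel: a perfect square.
square = fractal_dimension_index(perimeter=4 * 50, area=50 * 50)

# A strip one pixel wide and 100 pixels long (50 m x 5,000 m).
strip = fractal_dimension_index(perimeter=2 * (50 + 5000), area=50 * 5000)

print(round(square, 2), round(strip, 2))
```

The square yields a value of 1, whereas the elongated strip, which has far more perimeter per unit of area, yields a clearly higher value.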

Table 1 Results from Exercise 1. Table showing the spatial metrics (Land cover; Landscape proportion; Number of patches) for each category of the two maps that have been analysed (CORINE and SIOSE)

            Land cover (m²)               Landscape proportion   Number of patches
Category    CORINE         SIOSE          CORINE    SIOSE        CORINE    SIOSE
1           683,257,500    640,252,500    0.42      0.40         245       971
2           609,940,000    625,150,000    0.38      0.39         255       1,768
3           70,275,000     68,785,000     0.04      0.04         92        896
4           51,527,500     47,030,000     0.03      0.03         61        610
5           7,022,500      8,950,000      0.00      0.01         15        94
6           2,185,000      4,740,000      0.00      0.00         5         116
7           11,807,500     31,102,500     0.01      0.02         28        2,462
8           4,955,000      5,127,500      0.00      0.00         7         16
9           1,892,500      737,500        0.00      0.00         1         4
10          9,670,000      13,332,500     0.01      0.01         22        274
11          8,592,500      13,030,000     0.01      0.01         18        260
12          152,457,500    155,345,000    0.09      0.10         8         329


Table 2 Results from Exercise 1. Table showing the spatial metrics (Greatest patch area; Smallest patch area) for each category of the two maps that have been analysed (CORINE and SIOSE)

            Greatest patch area (m²)      Smallest patch area (m²)
Category    CORINE         SIOSE          CORINE       SIOSE
1           468,677,500    217,527,500    2,500        2,500
2           160,860,000    137,540,000    2,500        2,500
3           14,812,500     13,975,000     2,500        2,500
4           11,737,500     6,610,000      2,500        2,500
5           1,145,000      985,000        5,000        2,500
6           930,000        1,062,500      285,000      2,500
7           1,602,500      2,027,500      2,500        2,500
8           3,480,000      3,340,000      2,500        2,500
9           1,892,500      660,000        1,892,500    10,000
10          910,000        710,000        2,500        2,500
11          1,965,000      1,767,500      12,500       2,500
12          150,047,500    149,600,000    2,500        2,500


Table 3 Results from Exercise 1. Table showing the spatial metrics (Mean patch area; Median patch area) for each category of the two maps that have been analysed (CORINE and SIOSE)

            Mean patch area (m²)           Median patch area (m²)
Category    CORINE           SIOSE         CORINE       SIOSE
1           2,788,806.12     659,374.36    252,500      37,500
2           2,391,921.57     353,591.63    192,500      10,000
3           763,858.70       76,768.97     271,250      15,000
4           844,713.11       77,098.36     385,000      20,000
5           468,166.67       95,212.77     375,000      45,000
6           437,000.00       40,862.07     322,500      20,000
7           421,696.43       12,633.02     332,500      2,500
8           707,857.14       320,468.75    252,500      48,750
9           1,892,500.00     184,375.00    1,892,500    33,750
10          439,545.45       48,658.76     371,250      20,000
11          477,361.11       50,115.38     390,000      10,000
12          19,057,187.50    472,173.25    450,000      5,000


Contrary to what might be expected, and with the
exception of the port areas (Category 9 after reclassification),
patch shapes were more complex in CORINE than SIOSE.
This seems illogical given that SIOSE is made at a finer scale
(1:25,000) than CORINE (1:100,000) and delimits land use
areas more accurately.

In our case, CORINE has more complex patch shapes
than SIOSE because of the rasterization of the CORINE and
SIOSE vector databases, which reduced the complexity of
the SIOSE polygons, resulting in more regular shapes.

Finally, “Like adjacencies” and the “Patch cohesion
index” (Table 4) provide information about the compactness
of the categories in a map. Values closer to 0 mean that the
patches belonging to a particular category are very scattered.
Values closer to 1 (Like adjacencies) or to 10 (Patch cohesion
index) indicate that they are tightly clustered.

The “Like adjacencies” metric is based on the number of
adjacencies between pixels, whereas the “Patch cohesion
index” is obtained by calculating the ratio between the area
and the perimeter of the patches. This means that although
they provide information on a similar subject (compactness),
they complement each other.


These metrics show that land uses are represented in a 
more compact (more clustered) manner in the CORINE 
database. This makes sense because of the lower degree of 
fragmentation and the greater generalization of CORINE 
compared to SIOSE. 
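
The idea behind the “Like adjacencies” metric can be illustrated with a simplified sketch. It counts, for every pixel of a category, how many of its 4-neighbours belong to the same category, and divides this by the total number of neighbours examined. The function is hypothetical and the treatment of the map edge (neighbours outside the map are ignored) is an assumption; the plugin's own calculation may differ in its details.

```python
def like_adjacencies(raster, category):
    """Share of the 4-neighbour adjacencies of `category` pixels that
    join two pixels of that same category (0 = scattered, 1 = clustered).
    Simplified sketch: neighbours outside the map are ignored.
    """
    rows, cols = len(raster), len(raster[0])
    like = total = 0
    for r in range(rows):
        for c in range(cols):
            if raster[r][c] != category:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    total += 1
                    like += raster[nr][nc] == category
    return like / total if total else 0.0


clustered = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
]
scattered = [
    [1, 2, 1, 2],
    [2, 1, 2, 1],
]
print(like_adjacencies(clustered, 1), like_adjacencies(scattered, 1))
```

Both rasters contain the same number of pixels of Category 1, so their composition is identical, but the clustered arrangement scores far higher than the scattered one.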

All in all, even if important differences between the two 
maps could be identified in terms of landscape configuration, 
most of these are due to the different criteria used in the 
drawing of each map. This also applies to the small differ- 
ences in terms of landscape composition. Our CORINE map 
must therefore be considered validated after comparison 
with SIOSE. 

However, in order to be able to validate CORINE with
certainty and to interpret the results of the spatial metrics
more effectively, we should always compare the maps via
visual inspection. In this case, visual inspection reveals that
the differences identified by the spatial metrics are mostly
due to the different criteria used in the drawing of each map,
and not because they interpret land use in different ways.
Complementary tools must therefore be used to contextualize
the results of our validation or uncertainty analysis. If
this is not done, there is a high chance that we will make
incorrect assumptions due to not having all the relevant
information.


Table 4 Results from Exercise 1. Table showing the spatial metrics (Fractal dimension index; Like adjacencies; Patch cohesion index) for each category of the two maps that have been analysed (CORINE and SIOSE)

            Fractal dimension index       Like adjacencies      Patch cohesion index
Category    CORINE         SIOSE          CORINE    SIOSE       CORINE    SIOSE
1           1.10           1.09           0.85      0.75        9.96      9.93
2           [illegible]    [illegible]    0.84      0.75        9.93      9.91
3           1.11           1.07           0.78      0.58        9.77      9.71
4           1.10           1.06           0.79      0.63        9.72      9.53
5           1.11           1.08           0.73      0.62        9.37      9.03
6           1.11           1.06           0.73      0.52        9.32      8.98
7           1.13           1.03           0.66      0.28        9.43      9.20
8           1.11           1.09           0.78      0.74        9.66      9.64
9           1.09           1.12           0.87      0.52        9.63      9.34
10          1.11           1.06           0.72      0.55        9.33      8.80
11          1.13           1.07           0.71      0.54        9.43      9.23
12          1.08           1.06           0.96      0.93        9.95      9.95


Exercise 2. To validate a simulation against a reference map


Aim


To assess to what extent the pattern of our simulation is
similar to the pattern of a reference map for the same year,
which accurately reflects the real situation on the ground.


Materials


Simulation CORINE Asturias Central Area 2011
CORINE Land Use Map Asturias Central Area 2011


Requisites


The two maps must be raster. The background class must be
0 or no data.

For a proper validation, the reference map and the simulation
must refer to the same year.

Execution 

Step 1 


In order to comply with the requirements of the “LecoS” 
plugin, which assumes that pixels with the value 0 are No 
Data or background, we must first reclassify the two maps 
we are going to compare. The background, which is coded 
as 12, must be reclassified as 0. Agricultural areas, which 
were coded as 0, must be reclassified as 1. All the remaining 
classes must be reclassified following the same criteria (new 
code = original code + 1). 

The Reclassify by table tool (Processing toolbox > Raster
analysis > Reclassify by table) will be used to reclassify
the maps (Fig. 5). First, indicate the map you want to
reclassify and then fill in the “Reclassification table” with
the new category codes that will replace the existing ones in
the raster.
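
The recoding described in this step (background 12 becomes 0, every other code is increased by 1) can be written out as a list of table rows. The sketch below is only one possible way of filling in the table; it assumes integer codes from 0 to 12 and the range rule "minimum < value <= maximum", and the helper `build_reclass_table` is hypothetical.

```python
BACKGROUND_OLD = 12  # original background code, as stated in the text


def build_reclass_table(old_codes, background=BACKGROUND_OLD):
    """Return rows of (minimum, maximum, new_value) for the given codes.

    Each row isolates one integer code with the rule
    minimum < value <= maximum, i.e. (code - 1, code, new code).
    """
    rows = []
    for code in sorted(old_codes):
        new_code = 0 if code == background else code + 1
        rows.append((code - 1, code, new_code))
    return rows


table = build_reclass_table(range(13))  # original codes 0 to 12
for row in table:
    print(row)
```

The result has 13 rows of three values, one row per original code, with agricultural areas (0) mapped to 1 and the background (12) mapped to 0.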


Step 2 


Once the two maps have been reclassified, the next stage is 
to calculate the spatial metrics for each map: first for the 
simulation and then for the reference map. This is done using 
the Landscape statistics option in the "LecoS" plugin 
(Raster » Landscape ecology » Landscape statistics) 
(Fig. 6). 

In “Landcover grid” select the raster for which you want to
obtain the spatial metrics. You must also indicate the value of
the background (No-data) and its spatial resolution (Cellsize).
Finally, select the spatial metrics you are going to calculate.

Several spatial metrics can be selected at the same time
using the “Select multiple metrics” tab. In this case, we
selected the following metrics: Land cover; Landscape
proportion; Number of patches; Greatest patch area; Smallest
patch area; Mean patch area; Median patch area; Fractal
dimension index; Like adjacencies; Patch cohesion index.


Fig. 5 Exercise 2. Step 1. Reclassify by Table


Fig. 6 Exercise 2. Step 2. LecoS plugin


Results and Comments 


Once the spatial metrics for each of the maps have been 
calculated, the results of the analysis will be stored in CSV 
files in the folder of your choice. To make it easier to 
interpret and compare the spatial metrics, the two files must 
be merged into one. This can be done using a spreadsheet 
program such as OpenOffice Calc or Microsoft Excel. This 
will display the results in a table similar to Table 5. 

At first sight, the differences between the patterns of the
two maps do not seem very significant. This makes sense
because we are calculating the metrics for the whole area of the
maps, whereas land use changes only affect a small portion
of them, usually less than 10% or even 5% of the study
area. The changes we simulated or that actually happened
on the ground according to the reference map will not
therefore have a dramatic impact on the spatial metrics for
the whole map.

Even so, some differences can be observed. Agricultural 
areas (Category 1 after reclassification) and vegetation areas 
(Category 2) are made up of a larger number of patches in 
the simulation than in the reference map (Table 5). By 
contrast, urban fabric (Category 3) and industrial and com- 
mercial areas (Category 4) are made up of a slightly smaller 
number of patches. 
These trends may indicate that the changes simulated as
transitions to urban fabric and to industrial and commercial
areas have made these classes more compact (patches that
were not previously connected have now become connected
with the simulated changes). That is, these classes did not
grow in an isolated way, but via the expansion of previously
existing patches. The slight differences between the reference
map and the simulation in the “Like adjacencies” and
“Patch cohesion index” metrics for industrial and commercial
areas (Category 4 after reclassification) also point in this
direction.

In the process of expansion of urban fabric and industrial 
areas, some patches of agricultural and vegetation areas 
could become isolated, so increasing the fragmentation of 
the category. This would explain why there are more patches 
in these categories in the simulation than in the reference 
map. 

The difference in pattern between the simulation and the
reference map can best be calculated using spreadsheet
software, as in the example shown in Table 6. In this
table, for each category we have subtracted the value of each
spatial metric in the reference map from the value of the same
metric in the simulation. Thus, for instance, the reference map has
602,500 m² more agricultural areas (Category 1) than the
simulation. By contrast, the simulation has 1,205,000 m²
more vegetation areas (Category 2) than the reference
map. In our simulation, more space is also allocated to urban
fabric (Category 3) and industrial and commercial areas
(Category 4) than in the reference map.
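
The same subtraction can be scripted instead of being done in a spreadsheet. The sketch below applies it to the number of patches of the first four categories, using the values reported in Table 5; the sign convention (simulation minus reference) follows the column headings of Table 6.

```python
# Number of patches per category, copied from Table 5.
reference = {1: 245, 2: 255, 3: 92, 4: 61}
simulation = {1: 288, 2: 259, 3: 89, 4: 58}

# Positive values mean the simulation has more patches than the reference.
difference = {cat: simulation[cat] - reference[cat] for cat in reference}
print(difference)  # {1: 43, 2: 4, 3: -3, 4: -3}
```

Positive values for Categories 1 and 2 and negative values for Categories 3 and 4 reproduce the trends discussed above.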


Table 5 Results from Exercise 2. Table showing the spatial metrics (Number of patches; Like adjacencies; Patch cohesion index) for each category of the simulation and the reference map

            Number of patches          Like adjacencies           Patch cohesion index
Category    Reference    Simulation    Reference    Simulation    Reference    Simulation
1           245          288           0.85         0.85          9.96         9.96
2           255          259           0.84         0.84          9.93         9.93
3           92           89            0.78         0.78          9.77         9.77
4           61           58            0.79         0.80          9.72         9.73
5           15           24            0.73         0.73          9.37         9.33
6           5            5             0.73         0.71          9.32         9.23
7           28           27            0.66         0.65          9.43         9.43
8           7            7             0.78         0.73          9.66         9.52
9           1            1             0.87         0.87          9.63         9.63
10          22           22            0.72         0.72          9.33         9.31
11          18           17            0.71         0.71          9.43         9.43
12          8            8             0.96         0.96          9.95         9.95
Table 6 Results from Exercise 2. Difference in the value of the spatial metrics (Land cover; Greatest patch area; Mean patch area) calculated for the simulation and the reference map. The results in the table indicate how close or far apart the values of the spatial metrics are in the two maps

            Land cover (m²)            Greatest patch area (m²)   Mean patch area (m²)
Category    Dif Simulation - Ref map   Dif Simulation - Ref map   Dif Simulation - Ref map
1           -602,500                   347,500                    -418,476.26
2           1,205,000                  -150,000                   -32,288.37
3           182,500                    177,500                    27,798.60
4           155,000                    -45,000                    46,364.48
5           857,500                    -2770,000                  -139,833.34
6           -300,000                   -242,500                   -60,000.00
7           -927,500                   32,500                     -18,733.47
8           -1,642,500                 -1,642,500                 -234,642.85
9           0                          0                          0.00
10          -340,000                   0                          -15,454.54
11          -450,000                   0                          1,609.48
12          1,862,500                  1,857,500                  232,812.50


These differences in the total area allocated to each category
help us understand how the model calculates the
changes it simulates. If the model had simulated the same
amount of change that actually occurred on the maps, no
differences would be noticed.

In our simulation, we did not actively model the vacant
classes. Thus, whereas according to the reference map there
were many vegetation areas that changed to agricultural
areas, in our simulation this did not happen. As a consequence,
our simulation has more vegetation areas, but fewer
agricultural areas than the reference map.

The “Greatest patch area” metric shows that we did not
model one of the biggest industrial developments in the
study area correctly. The largest patch in our simulation is
450,000 m² smaller than the one in the reference map. The
opposite was true in the case of urban fabric. According to
the model, many pixels were considered to have changed as
a result of the expansion of large pre-existing patches, when
this trend was in fact not that strong according to the reference
map.

If we focus on the “Mean patch area” metric for the two
categories we modelled actively (3 and 4), we can see that in
both cases the mean area of patches is bigger in the
simulation than in the reference map. This may be due to the
same process as in urban fabric, i.e. most of the changes are
simulated as expansions of pre-existing large patches.

In all other categories apart from the first 4 (1, 2, 3, 4),
there are important differences between the two maps.
However, as changes in these categories were not modelled
in the simulation (they remained invariant), the differences
between the maps are due to changes that took place in the
reference map but were not simulated.


To sum up, it is difficult with the information available to 
us to understand whether the pattern of the changes we 
simulated is valid or not. We have various clues about the 
pattern of the changes (more compact and connected than in 
the reference map), but these trends are best confirmed by 
visual inspection. Calculating the spatial metrics solely for 
the areas that changed is also highly recommended and can 
provide additional insight. 
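
Restricting the analysis to the areas that changed requires a map that keeps only the changed pixels. One possible way of deriving such a map from two dates is sketched below. It is an assumption about how a change map could be built, not a description of how the change maps supplied with this chapter were produced, and the helper `change_map` is hypothetical.

```python
BACKGROUND = 0


def change_map(earlier, later, background=BACKGROUND):
    """Keep the later category where a pixel changed; background elsewhere."""
    return [
        [new if new != old else background for old, new in zip(row_a, row_b)]
        for row_a, row_b in zip(earlier, later)
    ]


earlier = [
    [1, 1, 2],
    [1, 2, 2],
]
later = [
    [1, 3, 2],
    [3, 2, 4],
]
print(change_map(earlier, later))  # [[0, 3, 0], [3, 0, 4]]
```

Persistent pixels are coded as background, so the spatial metrics computed on the result describe only the pattern of the changes.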


Exercise 3. To validate simulated changes against a 
reference map of changes 


Aim 


To assess to what extent the pattern of the changes we 
simulated is similar to the pattern of a reference map of 
changes for the same year, which accurately reflects the real 
situation on the ground. 


Materials 

CORINE Land Use Changes Asturias Central Area 2005-2011
Simulated CORINE changes Asturias Central Area 2005-2011


Requisites 


The two maps must be raster. The background class must be
0 or no data.

For a proper validation, the changes in the reference map
must refer to the same time period as the simulation period.


Fig. 7 Exercise 3. Step 1. LecoS plugin


Execution 
Step 1 


Given that the background is already coded 0 in the two
maps charting changes, we do not need to take any preliminary
steps prior to calculating the spatial metrics. This
can be done directly using the Landscape statistics option in
the “LecoS” plugin (Raster > Landscape ecology > Landscape
statistics).

In the tool, we must indicate the raster for which we want 
to calculate the spatial metrics (Landcover grid), the value of 
the background in our maps (No-Data) and their spatial 
resolution (Fig. 7). Several spatial metrics can be selected at 
the same time, using the "Select multiple metrics" tab. 




In this analysis, we will be calculating the following 
metrics: Land cover; Number of patches; Greatest patch 
area; Smallest patch area; Mean patch area; Median patch 
area; Fractal dimension index; Like adjacencies; Patch 
cohesion index. 


Step 2


We repeat this process for the second map.


Results and Comments


Once we have run the tool twice, once for each map, we will 
have two CSV files with the metrics for each of the change 
maps. These will be saved in the specified folder. 

The reference map of changes includes land use changes
for many categories (1, 2, 5, 6, 7, 8, 10 and 11) that are not
drawn on the simulated map of changes. This is because we
only actively simulated urban fabric (Category 3) and
industrial and commercial areas (Category 4). The rest of the
categories were only simulated passively (1, 2) or remained
invariant during the simulation (5, 6, 7, 8, 9, 10, 11, 12). As
a result, the map of simulated changes only includes patches
from Categories 3 and 4 (urban fabric, industrial and commercial
areas). We will therefore only compare the spatial
metrics for these categories.

The changes we simulated are quantitatively the same as 
the reference changes (Table 7). We can therefore say that 
our model correctly predicted the quantity of changes that 
happened in our study area. 

On the other hand, the pattern of the simulated changes
seems to be very different from the pattern of the reference
map of changes. In the reference map, the changes took
place in just a few patches and most of the pixels that
changed are allocated close to each other. In the simulation,
the changes are fragmented into many different patches
(Table 7). The “Mean patch area” and “Median patch area”
metrics confirm this trend (Table 8). The simulated changes
take place in very small patches, made up of just a few
pixels.

When working with Cellular Automata models, change
usually takes place organically as an expansion of existing
patches. In the real world, however, changes in urban and
industrial areas tend to happen at the same time over entire
cadastral parcels. Often, these parcels are quite big, comprising
a large number of pixels. However, as CA models
usually simulate change at the pixel level, they are not
normally capable of simulating big patches of change
covering large numbers of pixels. Our model therefore
behaves differently from the real processes taking place on
the ground, hence the disagreements in the pattern of simulated
changes.

Other metrics, such as “Like adjacencies” and “Patch
cohesion index”, confirm this behaviour. The pixels in the
reference map are better grouped than those in the simulated
map (Table 9). This is also manifested by the “Greatest patch
area” metric (Table 7). The largest patch is always much
bigger in the reference map of changes than in the
simulation.

In conclusion, the pattern of changes we simulated is very 
different to the pattern of changes in the reference 
map. However, this does not mean that the changes we 
simulated have altered the pattern of the simulated land- 
scape. On the contrary, as we discovered in the previous 
exercise, the pattern of the whole landscape remains very 
similar. 

It is important to remember here that we are only calcu- 
lating the pattern of the areas that changed, without viewing 
them in any larger context. By contrast, when we calculate 
the spatial metrics for the whole map, we also consider the 
context and can therefore assess whether the changes have 
altered the pattern of the map. Thus, both analyses are 
complementary. We recommend that users carry out both
analyses when validating the pattern of their simulations.

Finally, a qualitative validation through visual inspection 
is highly recommended for contextualizing the results and 
understanding them better. 


Table 7 Results from Exercise 3. Table showing the spatial metrics (Land cover; Number of patches; Greatest patch area) for each actively simulated category in the simulation and the reference map

            Land cover (m²)            Number of patches          Greatest patch area (m²)
Category    Reference    Simulation    Reference    Simulation    Reference    Simulation
3           2,280,000    2,280,000     20           121           572,500      190,000
4           1,800,000    1,800,000     24           172           295,000      182,500


Table 8 Results from Exercise 3. Table showing the spatial metrics (Mean patch area; Median patch area) for each actively simulated category in the simulation and the reference map

            Mean patch area (m²)       Median patch area (m²)
Category    Reference    Simulation    Reference    Simulation
3           114,000      18,842.97     40,000       5,000
4           75,000       10,465.12     55,000       2,500


Table 9 Results from Exercise 3. Table showing the spatial metrics (Fractal dimension index; Like adjacencies; Patch cohesion index) for each actively simulated category in the simulation and the reference map

            Fractal dimension index    Like adjacencies           Patch cohesion index
Category    Reference    Simulation    Reference    Simulation    Reference    Simulation
3           1.07         1.04          0.63         0.37          9.11         8.26
4           1.09         1.03          0.53         0.30          8.67         8.24
Exercise 4. To validate a series of maps with two or
more time points


Aim 


To test the consistency of the pattern of land uses in a series 
of LUC maps made up of two different time points. 


Materials 


CORINE Land Use Map Asturias Central Area 2005 v.0 
CORINE Land Use Map Asturias Central Area 2011 


Requisites 


The two maps must be raster. The background class must be 
0 or no data. 


Execution 
Step 1

In order to comply with the requirements of the “LecoS”
plugin, the maps must be reclassified to ensure that the
background code is 0 and all other categories have a positive
code different from 0. This is done using the Reclassify by
table tool (Processing toolbox > Raster analysis > Reclassify
by table) (Fig. 8).

After opening the tool, we indicate the map we want to
reclassify and then fill in the “Reclassification table” with the
new category codes that will replace the existing ones in the 
raster (Fig. 9). 
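
As a rough illustration of what this reclassification does, the sketch below remaps the codes of a small raster held as nested lists. The `reclassify` function, the input codes and the lookup table are all hypothetical: they only mimic the logic of the step (each original code is swapped for a new positive code and anything not listed becomes the background value 0) and are not the values shown in Fig. 9. The QGIS tool itself works with minimum/maximum ranges rather than single codes.

```python
def reclassify(grid, table, background=0):
    """Return a copy of grid with each code replaced via the lookup table.

    Codes that are missing from the table are sent to the background value.
    """
    return [[table.get(code, background) for code in row] for row in grid]


# Hypothetical original codes: 255 marks cells outside the study area.
original = [
    [255, 3, 3],
    [4, 4, 255],
]
# Hypothetical lookup table: old code -> new positive code.
lookup = {3: 1, 4: 2}

reclassified = reclassify(original, lookup)
print(reclassified)  # [[0, 1, 1], [2, 2, 0]]
```

The original grid is left untouched, so the result can be checked against the input before the reclassified map is passed on to the metrics step.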


Step 2 


Once the categories have been reclassified, the spatial
metrics for each map can be calculated using the Landscape
statistics option in the “LecoS” plugin (Raster > Landscape
ecology > Landscape statistics) (Fig. 10).

After opening the tool, we select the raster for which we
wish to obtain the metrics (Landcover grid), the background
value of the raster (No-Data) and its spatial resolution
(cellsize). We then select the different metrics we want to
calculate in the “Select multiple metrics” tab. In this case we
selected the following: Land cover; Landscape proportion;
Number of patches; Greatest patch area; Smallest patch area;
Mean patch area; Median patch area; Fractal dimension
index; Like adjacencies; Patch cohesion index.


Results and Comments 


After running the tool, the metrics are displayed in two CSV 
files which are saved in the specified folder. 


Fig. 8 Exercise 4. Step 1. Reclassify by Table


D. García-Álvarez and M. Paegelow 


Fig. 9 Exercise 4. Step 1. Table required for the “Reclassify by Table” tool


The metrics reveal important differences between the two 
maps in terms of landscape configuration, i.e. the way land 
uses are allocated on each map. 

The categories in the CORINE 2011 map are made up of 
many more patches than the same categories in the CORINE 
2005 map (Table 10). In some cases, such as urban fabric 
(Category 3 after reclassification), there are twice as many 
patches in the CORINE 2011 map (92) as in the CORINE 
2005 map (44). 

The “Like adjacencies” and “Patch cohesion index”
metrics also show slight differences between the maps. This
is unusual when comparing a time series of land use maps,
as these metrics are not usually sensitive to small changes in
the landscape. With the exception of highly dynamic
environments, in most of the study areas we might wish to
assess, change affects less than 5% of the landscape. We
should not therefore expect meaningful differences in the
spatial metrics that characterize the landscape over a short
period such as that used in our example (2005-2011).

The “Land cover” metrics show big differences between 
the maps in terms of the areas covered by each category 
(Table 11). One would not expect the composition of the 
landscape to change so much in just 6 years. Agricultural 
areas occupy 28,710,000 m² more in the CORINE 2005 map
than in the 2011 one. That means that 11,484 pixels changed 
over the 6-year period. However, a process of change of 
such magnitude was not observed on the ground in the study 
area. 
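
The pixel count quoted above follows from dividing the area difference by the area of one cell. The short sketch below reproduces that arithmetic, assuming square cells of 50 m, which is the cell size implied by the two figures in the text (28,710,000 m² / 11,484 pixels = 2,500 m² per pixel). The function name is ours and purely illustrative.

```python
CELL_SIZE_M = 50  # assumed cell size in metres, implied by the figures in the text


def area_to_pixels(area_m2, cell_size_m=CELL_SIZE_M):
    """Convert an area in square metres into a number of square cells."""
    return area_m2 / (cell_size_m ** 2)


print(area_to_pixels(28_710_000))  # 11484.0
```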

The “Greatest patch area” and “Mean patch area” metrics
also differ greatly for the two maps in the time series
(Table 11). These differences are also much bigger than
might be expected due to changes in the landscape over the
timeframe analysed.

Fig. 10 Exercise 4. Step 2. LecoS plugin 


Table 10 Results from Exercise 4. Table showing the spatial metrics (Number of patches; Like adjacencies; Patch cohesion index) for each category of the two maps that have been analysed (CORINE 2005 and CORINE 2011)

          Number of patches   Like adjacencies   Patch cohesion index
Category  C05      C11        C05      C11       C05      C11
1         126      245        0.87     0.85      9.96     9.96
2         173      255        0.86     0.84      9.91     9.93
3         44       92         0.85     0.78      9.81     9.77
4         33       61         0.84     0.79      9.73     9.72
5         14       15         0.78     0.73      9.39     9.37
6         2        5          0.79     0.73      9.45     9.32
7         15       28         0.67     0.66      9.35     9.43
8         3        7          0.80     0.78      9.60     9.66
9         1        1          0.88     0.87      9.58     9.63
10        12       22         0.77     0.72      9.26     9.33
11        12       18         0.71     0.71      9.27     9.43
12        5        8          0.96     0.96      9.95     9.95

Table 11 Results from Exercise 4. Difference in the value of the spatial metrics (Land cover; Greatest patch area; Mean patch area) calculated for the two maps that have been analysed (CORINE 2005 and CORINE 2011). The results in the table indicate how close the values of the spatial metrics are in the two maps

          Land cover (m²)   Greatest patch area (m²)   Mean patch area (m²)
Category  Dif. C11 - C05    Dif. C11 - C05             Dif. C11 - C05
1         -28,710,000       7,802,500                  -2,861,729.59
2         9,125,000         12,570,000                 -1,080,997.50
3         3,430,000         -4,810,000                 -155,345.84
4         580,000           2,482,500                  -699,150.53
5         -1,272,500        62,500                     -124,333.33
6         722,500           -120,000                   -294,250.00
7         4,760,000         667,500                    -48,136.90
8         1,710,000         1,182,500                  -373,809.53
9         370,000           370,000                    370,000.00
10        4,882,500         -1,500                     40,587.12
11        4,695,000         1,027,500                  152,569.44
12        -292,500          -421,500                   -11,492,812.50


These results indicate that there are many differences 
between the two maps that are not due to real changes in the 
landscape. These differences may be due to technical issues 
within the time series in that different methods were used to 
produce CORINE 2005 and 2011. 

These conclusions were confirmed by a visual inspection 
of the two maps, an additional check that is highly recom- 
mended to complement the results of this analysis. 


Exercise 5. To validate a series of maps with two or 
more time points (vector) 


Aim 

To study the pattern of a specific transition (from scrubland
to forest) in our study area (Ariège Valley) for a given period
(2000-2018).


Materials 


CORINE Land Cover Map Val d’Ariège 2000
CORINE Land Cover Map Val d’Ariège 2018


Requisites 


All raster maps must have the same resolution, extent and 
projection. 


Execution 
Step 1 


We begin by extracting the changes we want to study 
(transition from scrub to forest) with the Raster Calculator 
(Fig. 11). In the raster calculator expression box, we write an 
expression to obtain a map with the features that were scrub 
in 2000 (Category 4) and forest in 2018 (Category 3): 
“CLC_2000@1” = 4 AND “CLC_2018@1” = 3. 

This produces a raster showing the areas that underwent 
this transition (Fig. 12). 
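
The logic of this Raster Calculator expression can be sketched in a few lines. The example below is only an illustration on two tiny hypothetical grids held as nested lists (the function name and the data are ours): a cell is set to 1 when it had code 4 in the first map and code 3 in the second, and to 0 otherwise.

```python
def transition_mask(before, after, from_code, to_code):
    """Return a 0/1 grid marking cells that change from from_code to to_code."""
    return [
        [1 if b == from_code and a == to_code else 0
         for b, a in zip(row_before, row_after)]
        for row_before, row_after in zip(before, after)
    ]


# Hypothetical 2 x 3 rasters: 4 = scrub, 3 = forest, 2 = any other category.
clc_2000 = [[4, 4, 3], [2, 4, 3]]
clc_2018 = [[3, 4, 3], [2, 3, 3]]

mask = transition_mask(clc_2000, clc_2018, from_code=4, to_code=3)
print(mask)  # [[1, 0, 0], [0, 1, 0]]
```

Cells that were already forest in both years are not flagged, which is why both conditions have to be combined with AND.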


Step 2 


Once the raster for this transition has been obtained, it must be 
converted into vector format (polygons) using the Polygonize 
GDAL tool. When making this conversion, the “Use 
8-connectedness” option must be selected (Fig. 13). In this 
way, the tool considers all pixels diagonal to other pixels as part 
of the same polygon. If this option is not selected, pixels situated 
diagonal to other pixels are considered as separate polygons. 
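
The effect of the connectedness rule can be seen on a very small example. The sketch below is a plain flood fill written for illustration only (it is not the GDAL algorithm, and the function name and grid are hypothetical). It counts the groups of connected cells in a 0/1 grid, with diagonal neighbours either accepted (8-connectedness) or ignored (4-connectedness).

```python
def count_patches(grid, eight_connected=True):
    """Count groups of connected 1-cells in a 0/1 grid using a flood fill."""
    rows, cols = len(grid), len(grid[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if eight_connected:
        # Diagonal neighbours are only followed under 8-connectedness.
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = set()
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or (r, c) in seen:
                continue
            patches += 1
            seen.add((r, c))
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == 1 and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return patches


# Three cells that touch only at their corners.
diagonal = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
print(count_patches(diagonal, eight_connected=True))   # 1
print(count_patches(diagonal, eight_connected=False))  # 3
```

With diagonal neighbours accepted, the three cells form a single polygon. Without them, each cell becomes a polygon of its own, which changes the number, size and shape of the features analysed in the following steps.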


Step 3 


Once the polygons that undergo this transition have been 
obtained in vector format, we can then calculate their spatial 
metrics using the SAGA Polygon Shape Indices tool 
(Fig. 14). 


Fig. 11 Exercise 5. Step 1. Raster calculator 


After calculating the spatial metrics, we obtain a vector 
file. The values for these metrics are calculated for each 
polygon and are stored in the attribute table of the vector 
(Fig. 15). The metrics used in this case were: area; perime- 
ter; ratio perimeter / area; ratio perimeter / square root area; 
maximum distance; maximum distance / area; maximum 
distance / square root area; and shape index. 
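
To make these indices concrete, the sketch below computes them for a single polygon given as a list of vertices. It is a hedged illustration and not the SAGA implementation: the function name is ours, the shape index follows the formula Perimeter / (2 × √(π × Area)) given in the results of this exercise, and the maximum distance is assumed to be the longest vertex-to-vertex distance. The test polygon is a hypothetical 15 m square, chosen because its indices are close to the minimum values reported in Table 12.

```python
import math


def polygon_shape_indices(vertices):
    """Compute simple shape indices for a polygon given as (x, y) vertices."""
    n = len(vertices)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1            # shoelace formula
        perimeter += math.hypot(x2 - x1, y2 - y1)  # length of this edge
    area = abs(twice_area) / 2.0
    # Assumption: maximum distance = longest distance between two vertices.
    max_dist = max(
        math.hypot(xa - xb, ya - yb)
        for xa, ya in vertices
        for xb, yb in vertices
    )
    return {
        "area": area,
        "perimeter": perimeter,
        "p_a": perimeter / area,
        "p_sqrt_a": perimeter / math.sqrt(area),
        "max_dist": max_dist,
        "d_a": max_dist / area,
        "d_sqrt_a": max_dist / math.sqrt(area),
        "shape_index": perimeter / (2.0 * math.sqrt(math.pi * area)),
    }


square = polygon_shape_indices([(0, 0), (15, 0), (15, 15), (0, 15)])
for name, value in square.items():
    print(f"{name}: {value:.4f}")
```

A circle has a shape index of 1, the lowest possible value, so the square scores slightly above 1 (about 1.13) and more irregular polygons score higher still.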


Step 4 


In order to better interpret the general pattern of all the 
polygons that undergo this transition, the results of the 
metrics can be exported to a spreadsheet where statistics 
such as the mean, standard deviation, minimum and maxi- 
mum can be calculated (Table 12). 
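
The same summary can be produced without a spreadsheet. The sketch below uses Python's standard `statistics` module on a short list of hypothetical perimeter values. The numbers are made up for illustration and will not reproduce Table 12, and we assume the sample standard deviation, which may or may not be the variant used in the spreadsheet.

```python
import statistics


def summarize(values):
    """Return the mean, standard deviation, minimum and maximum of a list."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (assumed)
        "min": min(values),
        "max": max(values),
    }


# Hypothetical perimeter values in metres, one per polygon.
perimeters_m = [60.0, 1080.0, 2550.0, 4320.0]

summary = summarize(perimeters_m)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In practice the list would be filled from the attribute table exported in the previous step, with one call to `summarize` per metric.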




Results and Comments 


The pattern of the areas that transition from scrubland (2000)
to forest (2018) is very diverse, with patches of varying size,
compactness and shape complexity. The smallest polygon covers
only 224 m², while the largest occupies 2,462,094.65 m².
Perimeter lengths also vary enormously: from almost 60 m
to 13,155.37 m. These results indicate that the areas that
transition from scrubland to forests have very different sizes
and shapes.

The perimeter / area (P/A) ratio is a measure of the
compactness of the patches. Lower P/A values mean more
compact polygons, whereas higher P/A values mean elongated
or less compact polygons. The maximum distance metric
indicates the longest segment of a polygon. The maximum
distance / area (D/A) ratio is a measure of how elongated the
polygon is. Lower values indicate more compact, less
elongated polygons, whereas higher values mean the opposite.
Finally, the shape index measures the shape complexity of a
patch, using the following formula:
Perimeter / (2 × √(π × Area)).

The metrics calculated in this exercise can be compared
with the metrics obtained and analysed in Exercise 6 below,
which carries out the same analysis with raster data. The
comparison will offer an insight into how data format (vector
or raster) can affect the results of a pattern analysis.


Fig. 12 Intermediate output from Exercise 5. Map showing areas that transition from scrub to forest


Exercise 6. To validate a series of maps with two or
more time points (raster)


Aim 


To study the pattern of a specific transition (scrub into forest)
in our study area (Ariège Valley) for a given period
(2000-2018).


Spatial Metrics to Validate Land Use Cover Maps 

Fig. 13 Exercise 5. Step 2. Polygonize (Raster to Vector) 



Fig. 14 Exercise 5. Step 3. Polygon shape indices 



Fig. 15 Results from Exercise 5. QGIS table showing the spatial metrics (Area, Perimeter, Perimeter/Area; Perimeter / Square root of the area; 
Maximum distance; Distance / Area; Distance / Square root of the area; Shape index) for each transition area (polygon) of the analysed map 


Table 12 Results from Exercise 5. Table showing the mean, standard deviation, minimum and maximum of the spatial metrics (Area; Perimeter; Perimeter / Area; Perimeter / Square root of the area; Maximum distance; Distance / Area; Distance / Square root of the area; Shape index) calculated for the areas that underwent the scrub to forest transition

               Area (m²)      Perimeter (m)   P/A    P/sqrt(A)   Max. distance   D/A    D/sqrt(A)   Shape index
Mean           236,950.95     2,625.73        0.10   6.29        759.01          0.03   1.96        1.77
Standard dev.  427,332.78     2,769.54        0.11   1.74        758.96          0.04   0.56        0.49
Min            224.50         59.93           0.01   4.00        21.19           0.00   1.41        1.13
Max            2,462,094.65   13,155.37       0.27   10.00       3,200.44        0.09   3.75        2.82


Materials


CORINE Land Cover Map Val d’Ariège 2000
CORINE Land Cover Map Val d’Ariège 2018


Requisites


All maps must be rasters and have the same resolution,
extent and projection.


Execution
Step 1


We begin by extracting the specific changes we want to
study from our series of maps, i.e. the pixels that transitioned
from scrub (Category 4) to forest (Category 3). We do this
by introducing the following expression in the Raster
Calculator: “CLC_2000@1” = 4 AND “CLC_2018@1” = 3
(Fig. 11).


Step 2

Once the raster with the areas that changed from scrub to 
forest has been obtained, we then calculate their spatial 
metrics using the Landscape statistics option from the 
“LecoS” plugin (Raster > Landscape ecology > Landscape 
statistics) (Fig. 16). After opening the tool, we must select 
the raster layer to be analysed, the output folder where the 
results will be saved and the metrics we want to calculate 
(Fig. 17). To choose the metrics, we must select the “Mul- 
tiple metrics” tab. In this case, we selected 14 different 
metrics: Landscape Proportion, Edge length, Edge density, 
Number of Patches, Patch density, Greatest patch area, 
Smallest patch area, Mean patch area, Median patch area, 
Fractal Dimension Index, Mean patch shape ratio, Land- 
scape Division, Patch cohesion index and Splitting index. 


Results and Comments 


Once the spatial metrics have been calculated, the plugin 
creates a CSV file in the output folder with the results. 

Fig. 16 Exercise 6. Step 2. Landscape statistics option of the LecoS plugin 



Fig. 17 Exercise 6. Step 2. LecoS plugin 


Table 13 Results from Exercise 6. Table showing the spatial metrics (Landscape proportion; Edge length; Edge density; Number of patches; Patch density; Greatest patch area; Smallest patch area; Mean patch area; Median patch area; Fractal dimension index; Mean patch shape ratio; Landscape division; Patch cohesion index; Splitting index) for the areas that underwent the scrub to forest transition

Metric                    Value
Landscape proportion      1
Edge length               97,260
Edge density              0.011069
Number of patches         37
Patch density             0
Greatest patch area       2,467,575
Smallest patch area       225
Mean patch area           237,478.378
Median patch area         111,600
Fractal dimension index   1.06899650
Mean patch shape ratio    1.42935958
Landscape division        0.88744375
Patch cohesion index      9.82890312
Splitting index           225

The results show that 37 different patches underwent the
transition from scrub to forest, as shown in the “Number of
patches” metric in Table 13. These patches have different
sizes, varying from 225 m² to 2,467,575 m². There are a few
small patches, but most patches are big, as revealed by the
mean (237,478 m²) and median (111,600 m²) metrics.

The “Landscape proportion” metric indicates the percentage
of the studied landscape occupied by the category in
question. As we are only considering one category in our
analysis (the areas that transition from scrubland to forests),
this category occupies 100% of the studied landscape and
therefore has a landscape proportion value of 1 (Table 13).
The fractal dimension index provides information about the
complexity of the patches in the specified category. Values
closer to 2 mean more complex shapes, whereas values closer
to 1 mean simpler shapes.

The landscape division, patch cohesion and splitting
indices assess the compactness or fragmentation of the
patches that make up a class, i.e. how well aggregated they are.
A “Landscape division” value close to 1 means a very
fragmented landscape, whereas values close to 0 indicate a
landscape made up of a single patch. A “Patch cohesion”
value of 0 means one isolated patch, whereas values closer to
100 mean more aggregated patches. A “Splitting index” 
value of 1 indicates a landscape made up of a single patch, 
while splitting index values of more than 1 indicate a pro- 
gressively more fragmented landscape. 
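
The behaviour of the division and splitting indices described above can be checked with a short sketch. It assumes the usual textbook definitions of these two indices (division = 1 − Σ(aᵢ/A)² and splitting = A²/Σaᵢ², where aᵢ are the patch areas and A is the total area), taking A as the sum of the patches of the class. This is an assumption on our part: the LecoS plugin may compute these metrics differently, so the sketch is not expected to reproduce the values in Table 13. Function names and patch areas are hypothetical.

```python
def landscape_division(patch_areas):
    """Division index: 0 for a single patch, approaching 1 as fragmentation grows."""
    total = sum(patch_areas)
    return 1.0 - sum((area / total) ** 2 for area in patch_areas)


def splitting_index(patch_areas):
    """Splitting index: 1 for a single patch, larger for more fragmented classes."""
    total = sum(patch_areas)
    return total ** 2 / sum(area ** 2 for area in patch_areas)


single_patch = [100.0]
four_equal_patches = [25.0, 25.0, 25.0, 25.0]

print(landscape_division(single_patch), splitting_index(single_patch))              # 0.0 1.0
print(landscape_division(four_equal_patches), splitting_index(four_equal_patches))  # 0.75 4.0
```

Splitting the same total area into four equal patches raises the division index from 0 to 0.75 and the splitting index from 1 to 4, in line with the interpretation given in the text.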

If we compare these results to those obtained in vector
format (Exercise 5), we can see that very similar values were
obtained for comparable measures (e.g. mean area, greatest /
smallest area), while other measures use different formulas.
These include the shape and compactness indices (standardized
or not, area-weighted or not, completed by a constant or
not). The LecoS plugin also offers complementary indices
which are not calculated in vector format, such as the fractal
dimension or the splitting index. In addition, whereas the
spatial metrics in vector can be calculated individually for
each patch or polygon (Exercise 5), this is not possible in
raster format when using the LecoS plugin. The plugin
usually calculates the mean values of all the patches for each
metric.


Open Access This chapter is licensed under the terms of the Creative
Commons Attribution 4.0 International License (http://


Abstract


In this chapter we explore pattern analysis for categorical
LUC maps as a means of validating land use cover maps,
land change and land change simulations. In addition to
those described in Chap. “Spatial Metrics to Validate Land
Use Cover Maps”, we present three complementary
methods and techniques: a Goodness of Fit metric to
measure the agreement between two maps in terms of
pattern (Map Curves), a focus on changes on pattern
borders as a method for validating on-border processes, and
a technique for quantifying the magnitude of distance error.
Map Curves (Sect. 1) offers a universal pattern-based index,
called Goodness of Fit (GOF), which measures the spatial
concordance between categorical rasters or vector layers.
Complementary to this pattern validation metric, Sect. 2
focuses specifically on the changes that take place on
pattern borders. This enables changes to be divided into
those that take place on the borders of existing features and
those that form new, disconnected features. Bringing this
chapter on landscape patterns to a close, Sect. 3 presents a
technique for quantifying allocation errors in simulation
maps, and more precisely the minimum distance between
these errors and the nearest patch belonging to the same
category on the reference map. The comparison between a
raster-based and a vector-based approach brings us back to
the differences in measurement inherent in the representation
of entities in raster and vector mode. These techniques
are applied to two datasets. Section 1 uses the Asturias
Central Area database, where CORINE maps are compared
to SIOSE maps and simulation outputs. For their part, the
techniques described in Sects. 2 and 3 are applied to the
Ariège Valley database. CORINE maps for 2000 and 2018
are used as reference maps in comparisons with simulated
land covers.


Keywords
Allocation distance error · Change on pattern borders ·
Map Curves · Pattern shape and size indices


1 Map Curves 
Description 


This is a quantitative method proposed by Hargrove et al. 
(2006) to evaluate the spatial concordance between different 
categorical raster or vector datasets. It calculates the Good- 
ness of Fit (GOF) (Fig. 1), a standard metric that evaluates 
the spatial concordance between the patches of two or more 
rasters or the polygons of two or more vectors. Unlike other 
methods, it does not evaluate spatial agreement at cell level, 
and instead focuses on agreement at patch level in rasters or 
at polygon level in vectors. Consequently, this method is 
independent of spatial resolution. 
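
A minimal sketch of the overlap calculation behind GOF is given below. It assumes the formulation of Hargrove et al. (2006), in which every pair of intersecting polygons or patches contributes C / (B + C) × C / (A + C), where C is the area of intersection and A and B are the non-overlapping areas of the polygon in the map being compared and in the reference map respectively. The function name and the input layout are hypothetical, and this is not the code of the R scripts used in the exercises.

```python
def goodness_of_fit(overlaps):
    """Sum C/(B+C) * C/(A+C) over the intersecting polygon pairs.

    overlaps: iterable of (a, b, c) tuples, where c is the area of
    intersection and a and b are the non-overlapping areas of the
    polygon in the compared map and in the reference map.
    """
    total = 0.0
    for a, b, c in overlaps:
        if c > 0:  # pairs that do not intersect contribute nothing
            total += (c / (b + c)) * (c / (a + c))
    return total


# Full overlap: both polygons coincide exactly.
print(goodness_of_fit([(0.0, 0.0, 10.0)]))   # 1.0
# No overlap at all.
print(goodness_of_fit([(10.0, 10.0, 0.0)]))  # 0.0
# A 10-unit polygon lying entirely inside a 20-unit reference polygon.
print(goodness_of_fit([(0.0, 10.0, 10.0)]))  # 0.5
```

Because the score is built from shared and unshared areas rather than from cell-by-cell comparison, the same calculation applies to raster patches and to vector polygons.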


Fig. 1 Goodness of Fit (GOF) algorithm: GOF = Σ [C / (B + C)] × [C / (A + C)], where Σ refers to all the polygons or patches in Map 2 (reference map) intersecting each polygon or patch in Map 1 (map being compared); A refers to the area of each polygon or patch in Map 1 that is not intersected with polygons or patches in Map 2; B refers to the area of each polygon or patch in Map 2 that is not intersected with polygons or patches in Map 1; and C refers to the area of intersection between polygons or patches from Maps 1 and 2


GOF values range from 0 to 1. Maximum GOF (1) is
obtained when there is full overlap between two polygons or
patches. If there is no overlap, GOF is 0. If overlap affects
half the area of the polygons or patches, GOF will be 0.5.

When comparing pairs of maps, the GOF value may vary
depending on whether the assessed map is evaluated against
the reference map or the reference map is evaluated against
the assessed map. Map Curves calculates the GOF values for
both these operations. It then uses the highest of these two
GOF values in the comparison.

GOF values may be obtained either for the whole dataset
or for the set of patches or polygons that make up each
category on the map. Although it is technically possible to
calculate a GOF for each individual polygon or patch, it is
computationally very demanding and is not normally done.

Based on the GOF metrics at the category level, the results
of the map comparison may be expressed in a graph, which
shows the percentage of the categories in the map that have a
specific GOF value. For example, if there are 10 categories
and 2 of these have a GOF value of > 0.8, the graph will show
that 20% of the categories have GOF values of > 0.8.


Utility


Exercises


1. To validate a map against reference data/map
2. To validate a simulation against a reference map
3. To validate simulated changes against a reference map of changes
4. To validate a series of maps with two or more time points


Map Curves provides a simple metric for assessing the extent
to which two datasets share the same spatial structure, i.e. the
same number and shape of polygons or patches. Unlike many
other metrics, GOF evaluates the spatial agreement between
maps at a polygon or patch level. In most cases, this type of
analysis is based on raster data and comparisons are made at
cell level. However, polygons or patches reflect the real
structure of a landscape better than cells. GOF therefore
provides a better, more realistic method for validating the
similarity between maps than cell-based metrics.

GOF provides a standard and, therefore, comparable 
metric. The GOF value in one validation exercise may be 
compared with the GOF value obtained in another. Conse- 
quently, when using this metric to assess validity, we can 
establish a general minimum acceptable GOF threshold 
above which the map can be considered valid. 

Map Curves gives an overview of the pattern agreement 
for the whole landscape and at category level. However, it 
does not provide information about the agreement per 
polygon. This means that a few polygons that do not show 
good overlap when comparing the maps could be hidden in 
the general analysis. Thus, as currently implemented, this 
technique only provides information on spatial agreement at 
a category level and does not shed light on disagreements 
occurring at more detailed scales of analysis. 

The fact that GOF is unaffected by the spatial resolution
used in the analysis should be considered an important
strength, as spatial resolution is one of the main sources of
uncertainty associated with any validation exercise.
Nonetheless, at very coarse spatial resolutions, the area and
shape of some polygons and patches can become very
distorted, and this could affect the results of the analysis.
Therefore, when used with rasters, GOF can be considered
independent of spatial resolution below a certain threshold.

We do not recommend validating the spatial structure of a
map by comparing it with another map obtained at a
different resolution. Changes in spatial resolution or scale will
always result in changes in the spatial structure of the maps.


Advanced Pattern Analysis to Validate Land Use Cover Maps 


The results of the analysis will highlight not only the dif- 
ferences between the original maps in the way they represent 
LUC in the landscape, but also the differences produced by 
changes in the spatial resolution. 

Although Map Curves could be a useful tool for com- 
paring the agreement of the spatial pattern between different 
maps, its results must be treated with caution when validating 
the pattern of the maps. This is because Map Curves only 
assesses the degree of overlap between the patches or poly- 
gons belonging to each category in the two maps compared. 
If the overlap is low, the GOF score obtained by Map Curves 
analysis will also be low. However, this only means that their 
classes do not overlap well and does not imply that the two 
maps being compared have completely different patterns. 

Spatial metrics (see Chap. "Spatial Metrics to Validate 
Land Use Cover Maps") are more suitable for validating the 
pattern of the map. Even if there is no spatial overlap, they 
provide objective information about the fragmentation of the 
landscape or the complexity of the polygons/patches, which 
can be used when comparing two maps. Spatial metrics 
therefore allow us to compare pattern agreement between 
maps, even if they do not locate land uses in the same positions. 


QGIS Exercises 


Available tools 


* Processing Toolbox 
R 
Pattern evaluation 
Map Curves raster R script 
Map Curves vector R script 


There is no default tool in QGIS for carrying out Map 
Curves analysis. It is however implemented in R. We have 
developed two R tools for QGIS to perform the Map Curves 
analysis for either raster or vector data. To learn how to 
configure QGIS to work with R scripts, see Chap. “About 
This Book” of this book. This also explains how to install 
the different R scripts required to do some of the exercises 
presented in the book. 

The Map Curves raster script is based on the code
developed by Professor Emiel van Loon from the University
of Amsterdam.¹ The script provides full Map Curves results.
These consist of: (i) the GOF value of the analysis, with
details of the map used as a reference; (ii) the table for the
GOF between categories; and (iii) the Map Curves graph.
The R code of the Map Curves raster script also allows us to
compare raster and vector maps. However, the vector option
is unstable and does not always produce correct results. Its
use is therefore not recommended.

¹ The code is available on the Professor's personal website:
https://www.uva.nl/en/profile/l/o/e.e.vanloon/e.e.vanloon.html.

The Map Curves vector script, which can only be
employed to compare vector maps, is based on the “Sabre”
R package.² Unlike the previous script, it only provides
information on the overall GOF between the two maps and
the map used as a reference when obtaining it.

The Map Curves raster script provides more information 
than the Map Curves vector script. It is also much faster and 
more efficient. We therefore recommend that this analysis be 
carried out with raster data. 


Exercise 1. To validate a map against reference 
data/map 


Aim 


To check the agreement between the SIOSE and CORINE 
maps, considering SIOSE as a valid reference. We will assess 
to what extent the spatial structure of the CORINE map 
(number of polygons, shape) is similar to the SIOSE map. 


Materials 


SIOSE Land Use Map Asturias Central Area 2011 
CORINE Land Use Map Asturias Central Area 2011 


Requisites 


The two maps must be raster and have the same projection. 
Although the tool does work with raster maps at different 
extents and with different thematic resolutions, we recom- 
mend comparing rasters with the same or very similar 
extents and thematic resolutions, so as to avoid results that 
may not be particularly meaningful. 


Execution 


If necessary, install the Processing R provider plugin, and 
download the MapCurves_raster.rsx R script into the R 
scripts folder (processing/rscripts). For more details, see 
Chap. “About This Book” of this book. 


² Full details of this R package and the functions it includes can be 
found at: https://cran.r-project.org/web/packages/sabre/index.html. 


Fig. 2 Exercise 1. Step 1. Map Curves Raster R script 


Step 1 


Open the Map Curves Raster function and fill in the required 
parameters. These are basically the two LUC maps to be 
compared: “Land use map 1” (SIOSE) and “Land use map 
2” (CORINE) (Fig. 2). 


Results and Comments 


After running the function, we obtain two tables and one 
graph. All the information, with the exception of the graph, 
will also be displayed in the “Log” window (Fig. 3). 

The GOF value is a measure of the general agreement 
between the two maps being compared. This value ranges 
from 0 to 1, with 0 meaning no agreement and 1 total 
agreement. The GOF value for our comparison (0.54) indi- 
cates that the agreement between the two maps is significant, 
although not very high. The patches of the same categories 
partially overlap. 

The reference map ($Refmap) value informs us as to 
which map was used as the reference when obtaining the 
GOF value. If value “A” is obtained, it means that “Land use 
map 1” was used as the reference map in the comparison. If 
value “B” appears, it means that “Land use map 2” was used. 
Therefore, in our case, a GOF of 0.54 was obtained when 
comparing SIOSE and CORINE and taking CORINE as the 
reference. If SIOSE had been taken as the reference, 
agreement (GOF value) would have been lower. 

The GOF table details the GOF value for agreement per 
category, so providing a measure of how similar the pattern 
for a particular category is in the two maps. It therefore 
answers the following question: to what extent do the pat- 
ches that make up a particular category overlap in the two 
maps being compared? 
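
The idea of a per-pair agreement score can be sketched in a few lines of Python. This is only an illustration, assuming that the score for a pair of categories is the product of two overlap proportions (shared area over the category area in map A, and shared area over the category area in map B), as in the usual Map Curves formulation. The R script used in this exercise may differ in detail, and the function name `gof_table` and the two small grids are hypothetical.

```python
from collections import Counter

def gof_table(map_a, map_b):
    """Pairwise overlap score for two equally sized category grids.

    Assumption: score(i, j) = (C / A_i) * (C / B_j), where C is the number
    of cells that are category i in map A and category j in map B, and
    A_i and B_j are the total cell counts of those categories.
    """
    cells_a = [c for row in map_a for c in row]
    cells_b = [c for row in map_b for c in row]
    if len(cells_a) != len(cells_b):
        raise ValueError("both maps must cover the same cells")
    area_a = Counter(cells_a)
    area_b = Counter(cells_b)
    overlap = Counter(zip(cells_a, cells_b))
    return {(i, j): (c / area_a[i]) * (c / area_b[j])
            for (i, j), c in overlap.items()}

# Hypothetical 2 x 4 maps with two categories (0 and 1)
map_a = [[0, 0, 1, 1],
         [0, 0, 1, 1]]
map_b = [[0, 0, 0, 1],
         [0, 0, 1, 1]]

table = gof_table(map_a, map_b)
for pair, score in sorted(table.items()):
    print(pair, round(score, 3))
```

Pairs of categories that never overlap do not appear in the dictionary; they correspond to the zeros in the GOF table.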

In our case, the category that shows the greatest pattern 
agreement between the two maps is water bodies (Category 
11), with a GOF value of 0.968. Agricultural areas (Category 
0; GOF 0.783) and vegetation areas (Category 1; GOF 
0.800) also show high levels of agreement. By contrast, 
agreement between the two maps is very low for road and 
rail networks (Category 6; GOF 0.112). 

If we observe the two maps, most of the agreement and 
disagreement is due to the fact that they follow different 
Minimum Mapping Unit (MMU) and Minimum Mapping 
Width (MMW) criteria. Thus, if a patch is larger than the 
MMU and MMW of both maps, it will be similarly mapped 
in both cases. However, if a patch is drawn in SIOSE, but is 
too small for the MMU and MMW of CORINE, this will 
lead to disagreement between the two maps. 

This explains the results for Category 6 (road and rail 
networks). Whereas many patches representing road and rail 
networks are mapped in SIOSE, most of them are not 
mapped in CORINE because they are less than 100 m wide 
and therefore do not comply with its MMW criterion 
(Fig. 3). Agreement is high in the few patches of this 
category in which the two maps overlap, but in most cases 
the SIOSE road and rail network patches do not overlap 
with patches in CORINE, and the agreement is null. Overall, 
the agreement for this category in the two maps is therefore 
very low, with a GOF of just 0.112. 


Fig. 3 Results from Exercise 1 displayed in the Log window of the Map Curves Raster script. General GOF value and GOF table 

In this exercise, the GOF values for the different cate- 
gories did not indicate a high degree of similarity between 
the category patterns on the two maps. On the contrary, they 
indicated different patterns of fragmentation for each cate- 
gory because of the different MMU and MMW rules applied 
in each map. 

In addition to the overall GOF and the GOF table 
detailing the GOF agreement per category, the Map Curves 
function also produces two extra tables: $BMC_A2B and 
$BMC_B2A (Fig. 4). 

Unlike the other two tables, these tables are only displayed 
in the “Log” window and are not stored in any folder. 
For each category, they indicate the category with which it 
shows most agreement (GOF) on the other map. Whereas 
the information in the first table ($BMC_A2B) was obtained 
using map A (Land use map 1) as the reference, the 
information in the second table ($BMC_B2A) was obtained 
using map B (Land use map 2) as the reference. 

When Land use map 1 (SIOSE) was used as the reference 
map, the agricultural areas (category 0) in SIOSE showed 
the best agreement with the agricultural areas (category 0) in 
CORINE. The GOF value was 0.783, which indicates a very 
high overlap between the patches of this category on the two 
maps. 

For Land use map 2 (CORINE), the agricultural areas 
(category 0) showed the best agreement with the agricultural 
areas (category 0) of SIOSE. The GOF value was the same 
as that obtained when SIOSE was used as the reference. In 
this category it therefore makes no difference which map is 
used as the reference map. 

All the categories showed their best agreement with the 
same category on the other map. In other words, agricultural 
areas in Map 1 showed their best agreement with agricultural 
areas in Map 2, and vegetation areas in Map 1 showed their 
best agreement with vegetation areas in Map 2 etc. This 
indicates that the two maps are thematically consistent, i.e. 
the categories are distributed in a similar way in both maps. 
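
The best-match check described above can be sketched as follows. The 4 x 4 table is an excerpt of the values shown in the Log window for the first four categories; treating rows as map A and columns as map B is our assumption, and the helper names are hypothetical.

```python
# Excerpt of the GOF table (first four categories, values from the Log window).
# Assumption: rows are categories of map A, columns are categories of map B.
gof = [
    [0.783, 0.005, 0.001, 0.000],
    [0.009, 0.800, 0.000, 0.000],
    [0.004, 0.000, 0.489, 0.004],
    [0.001, 0.000, 0.010, 0.458],
]

def best_matches(table):
    """For each row category, return (best-matching column category, its GOF)."""
    return [max(enumerate(row), key=lambda pair: pair[1]) for row in table]

def transpose(table):
    """Swap rows and columns so that the other map becomes the reference."""
    return [list(col) for col in zip(*table)]

a2b = best_matches(gof)             # map A used as the reference
b2a = best_matches(transpose(gof))  # map B used as the reference

# The maps are thematically consistent if every category matches itself.
consistent = (all(best == i for i, (best, _) in enumerate(a2b))
              and all(best == j for j, (best, _) in enumerate(b2a)))
print(a2b, b2a, consistent)
```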

Finally, the last result provided by the Map Curves 
function is the Map Curves graph (Fig. 5), which is stored in 
.png format in the folder specified when running the tool (R 
plots). The graph presents the same information provided in 
the GOF table. It represents the percentage of categories that 
reach or exceed a specific GOF threshold. Thus, all the 
categories (100%) always have a GOF score higher than 0. 
However, only around 40% of the categories in this map 
have a GOF score of over 0.5 and none of the categories 
show perfect agreement (0% of the categories have a GOF 
score of 1) (Fig. 5). 


Fig. 4 Results from Exercise 1 displayed in the Log window of the Map Curves Raster R script. Tables indicating the category with which each 
category in the reference map shows the highest agreement 

The graph provides the GOF scores using either Land use 
map 1 (A) or Land use map 2 (B) as a reference. It is 
therefore a good summary of the pattern agreement between 
the two maps. 

In summary, in this exercise we have noted that although 
the GOF value is not very high, CORINE has a very similar 
pattern to SIOSE. The lower GOF is the result of different 
pattern fragmentation in the two maps: SIOSE maps have 
many small patches that do not appear in CORINE. How- 
ever, if we look at the maps, the polygons from the same 
category usually overlap very well and have a similar pattern 
structure. In addition, thematic agreement, as we noted in the 
$BMC_A2B and $BMC_B2A tables, seems to be very high. 


Fig. 5 Result from Exercise 1. Map Curves graph 


Exercise 2. To validate a simulation against a 
reference map 


Aim 


To assess the similarity between the spatial structure of a 
simulation and the spatial structure of a map used as a 
reference. 


Materials 


Simulation CORINE Asturias Central Area 2011 
CORINE Land Use Map Asturias Central Area 2011 


Requisites 


The two maps must be raster and have the same projection. 
Although the tool works with raster maps at different extents 
and with different thematic resolutions, we recommend that 
raster maps with the same or very similar extents and the- 
matic resolutions be compared so as to avoid results that 
may be not fully informative. For a proper validation, the 
reference map must be for the same year as the simulation. 


Execution 


If necessary, install the Processing R provider plugin and 
download the MapCurves_raster.rsx R script into the R 
scripts folder (processing/rscripts). For more details, see 
Chap. “About This Book”. 


Step 1 


Open the Map Curves Raster function and fill in the required 
parameters: “Land Use map 1” (CORINE simulation) and 
“Land Use map 2” (CORINE reference map) (Fig. 6). 


Results and Comments 


After running the tool, a GOF value was obtained for the 
whole maps compared and broken down per pair of classes 
(GOF table). The GOF values are stored in different tables and 
displayed in the “Log” window ($GOF, $GOFtable). The GOF 
values per pair of classes are also represented in the Map 
Curves graph, which is stored in the specified folder (R Plots). 

The GOF value for our comparison is very high (0.92). 
This is logical given that most of the simulated landscape did 
not change over the simulation period. Permanence is one of 
the easiest processes to simulate in LUC modelling, so the 
reference and the simulated maps look very similar: most of 
the pattern remains unchanged over the simulation period and 
was correctly simulated as such. 

The agreement (GOF) per category was always very high. 
The minimum scores were for port areas (0.669) and mineral 
extraction sites (0.708). In the modelling exercise, these 
categories were treated as features (categories that remained 
invariant during the simulation) and were therefore not 
simulated. However, a few changes did in fact occur in these 
categories in the reference map. As a result, the Map Curves 
analysis produced a relatively poor fit for these categories 
when comparing the simulation with the reference map. 
Given that these categories consist of a very small 
number of patches, even a small number of changes can 
reduce the GOF values substantially. 


Fig. 6 Exercise 2. Step 1. Map Curves Raster R script 

All in all, this analysis is not particularly meaningful. It 
confirms that the two compared maps have very similar 
patterns because most of the landscape was correctly simu- 
lated as permanence. However, more meaningful results 
could be obtained by focusing exclusively on the areas that 
were simulated as change. Hence, for a proper validation of 
the simulation, the simulated changes must be compared 
with the changes observed on the reference maps. 


Exercise 3. To validate simulated changes against a 
reference map of changes 


Aim 


To evaluate how similar the changes we simulated in our 
modelling exercise are to those observed on the reference map. 


Materials 


CORINE Land Use Changes Asturias Central Area 2005-2011 
Simulated CORINE changes Asturias Central Area 2005-2011 


Requisites 


The two maps must be raster and have the same projection. 
Although the tool does work with raster maps at different 
extents and with different thematic resolutions, we recom- 
mend comparing rasters with the same or very similar 
extents and thematic resolutions, so as to avoid results that 
may not be very meaningful. For a proper validation, the 
simulation and the reference map must refer to the same time 
period. In both cases, the maps must only display the 
changes that occurred during the study period, showing all 
other areas as 0 or some other suitable code. 
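
As an illustration of this requirement, a map of changes can be derived from two dates as sketched below. The grids and the function name `change_map` are hypothetical, and the sketch assumes that the code used for unchanged cells (0 here) is not also a valid destination category.

```python
NO_CHANGE = 0  # code for cells without change; assumed not to be a land use code

def change_map(before, after, no_change=NO_CHANGE):
    """Keep the end-date category where a cell changed; write no_change elsewhere."""
    if len(before) != len(after) or any(
            len(rb) != len(ra) for rb, ra in zip(before, after)):
        raise ValueError("both maps must have the same number of rows and columns")
    return [[a if a != b else no_change for b, a in zip(row_b, row_a)]
            for row_b, row_a in zip(before, after)]

# Hypothetical 3 x 3 land use grids for the start and end of the period
lu_2005 = [[1, 1, 3],
           [1, 2, 3],
           [2, 2, 4]]
lu_2011 = [[1, 3, 3],
           [1, 2, 3],
           [4, 2, 4]]

changes = change_map(lu_2005, lu_2011)
print(changes)
```

The same function would be applied to the simulated end-date map to obtain the map of simulated changes, so that both inputs to the Map Curves tool follow the same coding.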
Execution 


If necessary, install the Processing R provider plugin and 
download the MapCurves_raster.rsx R script into the R 
scripts folder (processing/rscripts). For more details, see 
Chap. “About This Book”. 


Step 1 


Open the Map Curves Raster function and fill in the required 
parameters: “Land Use map 1” (Simulated CORINE changes) 
and “Land Use map 2” (CORINE changes) (Fig. 7). 


Results and Comments 


After running the function, we get the overall GOF ($GOF) 
value, the GOF value per category ($GOFtable) and the Map 
Curves graph (R Plots). In this case, the only results that 
might be useful for interpreting the validity of the simulated 
changes are the results per category. 

The general GOF value is 0.3, but this is artificially high 
due to the almost perfect overlap of class 0 (areas with no 
change) which has a GOF value of 0.993 (Table 1). A high 
level of agreement between areas of permanence is always 
expected, as explained in detail in the previous exercise 
(Exercise 2). In this case, however, we want to assess the 
agreement between simulated changes and reference map 
changes for the two classes that were modelled actively: 
urban fabric and industrial and commercial areas. 

The spatial overlap between these two categories in the 
two maps is very low. The GOF value for urban fabric 
(Category 3 in the maps) is only 0.05. In the case of 
industrial and commercial areas (Category 4) it is even 
lower: 0.039. 

This means that the spatial structure of the simulated 
changes is very different to that of the changes used as a 
reference for the same period. Thus, even though the Map 
Curves analysis for the whole simulation (persistence and 
changes) obtained good results, the simulated changes 
overlap poorly with the changes mapped in the reference 
data. 

We cannot draw final conclusions about the different 
patterns of simulated and reference changes. Even if there is 
no overlap between them, their shape or fragmentation could 
be similar. For a clearer picture of these aspects, other tools, 
such as spatial metrics, must be used (see Chap. “Spatial 
Metrics to Validate Land Use Cover Maps”). 


Fig. 7 Exercise 3. Step 1. Map Curves Raster R script 


Table 1 Result from Exercise 3 showing the class GOF values between observed and simulated land use 

      0      1      2      3      4      5      6      7      8      10     11 
0     0.993  0.001  0.001  0.001  0      0      0      0.001  0.001  0      0 
3     0      0      0      0.05   0      0      0      0      0      0.023  0 
4     0      0      0      0      0.039  0      0      0      0      0      0 


Exercise 4. To validate a series of maps with two or 
more time points 


Aim 


To test the consistency of the pattern of land uses in a series 
of LUC maps made up of two different time points. 


Materials 


CORINE Land Use Map Asturias Central Area 2005 v.0 
CORINE Land Use Map Asturias Central Area 2011 


Requisites 


The two maps must be raster and have the same projection. It 
is also recommended that they have similar extents and 
thematic resolutions. 


Execution 


If necessary, install the Processing R provider plugin and 
download the MapCurves_raster.rsx R script into the R 
scripts folder (processing/rscripts). For more details, see 
Chap. “About This Book”. 


Step 1 


Open the Map Curves Raster function and fill in the required 
parameters: “Land Use map 1” (CORINE 2005) and “Land 
Use map 2” (CORINE 2011) (Fig. 8). 


Results and Comments 


The results show the level of overall agreement between the 
pair of maps compared ($GOF), the agreement per category 
($GOFtable), the best matches between categories 
($BMC_A2B, $BMC_B2A) and the Map Curves graph (R 
plots). All results are displayed in the “Log” window and 
stored in the preselected folders. 


Fig. 8 Exercise 4. Step 1. Map Curves Raster R script 


The overall agreement between our maps is 0.5, which is not 
high. This means that there is only partial overlap between 
the categories in the two maps. In a series of two or more 
Land Use maps, persistence is the norm and one would 
expect almost perfect overlap between the maps for most of 
the landscape. Landscapes must be very dynamic to expe- 
rience changes affecting more than 10% of the study area. 
The Asturias Central Area is not a dynamic landscape of this 
kind. The low GOF score therefore suggests that a lot of the 
differences between the two maps are due to technical 
changes or errors. 
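
A quick way to check how much of the study area differs between the two dates is to count the cells whose category is not the same in both maps. The following sketch uses two hypothetical grids and a hypothetical function name; it is a simple cell-by-cell comparison and not part of the Map Curves script.

```python
def changed_share(map_t1, map_t2):
    """Fraction of cells whose category differs between the two dates."""
    pairs = [(a, b)
             for row_1, row_2 in zip(map_t1, map_t2)
             for a, b in zip(row_1, row_2)]
    if not pairs:
        raise ValueError("the maps are empty")
    return sum(1 for a, b in pairs if a != b) / len(pairs)

# Hypothetical 4 x 5 maps; two cells differ between the two dates
map_t1 = [[0, 0, 1, 1, 1],
          [0, 0, 1, 1, 1],
          [0, 0, 2, 2, 2],
          [0, 0, 2, 2, 2]]
map_t2 = [[0, 0, 0, 1, 1],
          [0, 0, 1, 1, 1],
          [2, 0, 2, 2, 2],
          [0, 0, 2, 2, 2]]

share = changed_share(map_t1, map_t2)
print(f"{share:.0%} of the cells differ")
```

In a landscape that is not very dynamic, a share well above the 10% mentioned in the text would point to technical differences between the maps rather than to real change.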

When agreement was assessed at the category level, the 
only very high values were for water bodies (Category 11), 
with a GOF of 0.961 (Fig. 9), and background (Category 
12), with a GOF of 1. The background is therefore identical 
in the two maps, whereas the water bodies have an almost 
perfect overlap. The small difference between the two maps 
for the water bodies category (0.039) could be due to 
spurious or erroneous changes, although real changes in the 
areas covered by water may also have taken place. 


Fig. 9 Result from Exercise 4. GOF matrix 

The agricultural areas (0.709), vegetation areas (0.704) 
and airports (0.778) show a high level of agreement between 
the two maps. However, there are still important differences 
between them that cannot be explained solely by the normal 
land use dynamism of the study area, in which only small 
changes usually take place. 


Fig. 10 Results from Exercise 4. Tables indicating, for each category 
in the reference map, the category in the compared map with which it 
shows the highest agreement. $BMC_A2B shows the agreements when using 
map A as the reference; $BMC_B2A shows the agreements when using map B 
as the reference 

      $BMC_A2B              $BMC_B2A 
      A    B    mGOF        A    B    mGOF 
1     0    0    0.709       0    0    0.709 
2     1    1    0.704       1    1    0.704 
3     2    2    0.460       2    2    0.460 
4     3    3    0.467       3    3    0.467 
5     4    4    0.300       4    4    0.300 
6     5    5    0.183       5    5    0.183 
7     6    6    0.113       6    6    0.113 
8     7    7    0.407       7    7    0.407 
9     8    8    0.778       8    8    0.778 
10    9    9    0.245       9    9    0.245 
11    10   10   0.155       10   10   0.155 
12    11   11   0.961       11   11   0.961 
13    12   12   1.000       12   12   1.000 


For all the other categories, agreement is low or very low. 
Nonetheless, there is no evidence of systematic confusion 
between one category on the first map and a different category 
on the second. This is confirmed by the tables showing the 
best matches between categories (Fig. 10) in which the best 
match for each category (i.e. the largest overlap or agreement) 
was always with the same category on the other map. 

The low agreement or overlap between the categories in 
the two maps is also summarized in the Map Curves graph 
(Fig. 11), which shows that only around 40% of the classes 
on the maps obtained a GOF score of over 0.5. This means 
that more than half the categories show poor overlaps, i.e. 
most of the categories are mapped very differently on each 
map. 
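
The reading of the Map Curves graph can be reproduced from the per-category GOF values. The sketch below uses the thirteen mGOF values listed in the best-match tables for this exercise; the function name is hypothetical, and we assume that a category counts towards a threshold only when its GOF is strictly above it, which is consistent with the way the graph is described in the text.

```python
# Per-category GOF values for Exercise 4 (mGOF column of the best-match tables)
category_gof = [0.709, 0.704, 0.460, 0.467, 0.300, 0.183, 0.113,
                0.407, 0.778, 0.245, 0.155, 0.961, 1.000]

def share_above(values, threshold):
    """Fraction of categories whose GOF is strictly above the threshold."""
    return sum(v > threshold for v in values) / len(values)

# Points of the curve for thresholds 0.0, 0.1, ..., 1.0
curve = [(t / 10, share_above(category_gof, t / 10)) for t in range(11)]
share_over_half = share_above(category_gof, 0.5)

for threshold, share in curve:
    print(f"GOF > {threshold:.1f}: {share:.0%} of categories")
```

Five of the thirteen categories exceed 0.5, which is the "around 40%" read from the graph.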

All in all, we can conclude that the time series we 
assessed has many errors and uncertainties and is therefore 
affected by many erroneous or spurious changes. These are 
changes that did not really happen on the ground and arose 
due to technical reasons, such as different production 
methods. In a coherent time series of LUC maps, high GOF 
scores of 0.9 or over would be expected. 

The low agreement in our exercise is due to the change in 
the methodology used to produce the Spanish CORINE 
Land Cover maps between 2006 and 2011. The CORINE 
2005 map (v.00) used in this exercise was obtained using 
photointerpretation of satellite imagery. However, from 
2011 onwards the CORINE maps were obtained by 
generalizing more detailed Land Use maps (SIOSE). This change 
in the production method resulted in LUC maps with 
important differences from their predecessors. In order to 
solve this problem, the Copernicus service produced another 
CORINE map for 2005 in Spain according to the new 
methodology, which was consistent and comparable with the 
CORINE 2011 map. This more recent version of the 
CORINE 2005 map is the one normally used in the different 
exercises of this book. 


Fig. 11 Result from Exercise 4. Map Curves graph 


2 Change on Pattern Borders 


Description 


In pairs of maps or time series, this technique is used to 
identify the changes taking place on the edges of patches. 
The allocation of changes (on the edge of an existing patch 
or a new disconnected one) provides useful information 
about the nature of change dynamics: the expanding or 
shrinking of existing boundaries or the appearance of new 
land use patches. 


Utility 


Exercises 


1. To validate a series of maps with two or more time points 


By detecting the changes taking place on the edges of the 
patches, we can assess both the type of landscape dynamics 
taking place and the data errors resulting from different data 
sources, classifiers or spectral responses. 
QGIS Exercise 


Available tools 


* Raster 
    Raster Calculator 
    Conversion 
        Polygonize 
* Vector Overlay 
    Extract by location 
* Vector Table 
    Field Calculator 
* Vector Analysis 
    Basic statistics for fields 

For the sake of simplicity, we will only be presenting the tools 
used in this exercise, although we are aware that there are 
many other tools that could be used to carry out this analysis. 


Exercise 1. To validate a series of maps with two or 
more time points 


M. Paegelow and D. García-Álvarez 


Aim 


To focus on gains taking place on the edges of patches for a 
specific land use/cover category. We can then assess the 
proportion of change taking place on the edges of existing 
patches compared to the change that appears in new, dis- 
connected areas. 


Materials 


CORINE Land Cover Map Val d'Ariège 2000 
CORINE Land Cover Map Val d'Ariège 2018 


Requisites 


All maps must be rasters and have the same resolution, 
extent and projection. 


Execution 


Step 1 


First, we extract forests in 2000 (Fig. 12) and then new 
forested locations in 2018 (non-forest in 2000 AND forest in 
2018) using the Raster Calculator (Fig. 13). 


Fig. 12 Exercise 1. Step 1. Raster Calculator. Expression used: "CLC_2000@1" = 3 


Fig. 13 Exercise 1. Step 1. Raster Calculator. Expression used: "CLC_2000@1" != 3 AND "CLC_2018@1" = 3 


Figure 14 shows the result as an overlay of the two maps 
obtained: forest in 2000 in light green and forest gains 
between 2000 and 2018 in dark green. 
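
The logic of the two Raster Calculator expressions can be illustrated with plain Python lists. The grids below are hypothetical; only the forest code (3) and the two conditions are taken from the expressions used in this step.

```python
FOREST = 3  # category code used in the Raster Calculator expressions

def forest_mask(clc):
    """Equivalent of "CLC_2000@1" = 3: 1 where the cell is forest, 0 elsewhere."""
    return [[1 if v == FOREST else 0 for v in row] for row in clc]

def forest_gain_mask(clc_start, clc_end):
    """Equivalent of "CLC_2000@1" != 3 AND "CLC_2018@1" = 3."""
    return [[1 if s != FOREST and e == FOREST else 0
             for s, e in zip(row_s, row_e)]
            for row_s, row_e in zip(clc_start, clc_end)]

# Hypothetical 3 x 4 land cover grids for the two dates
clc_2000 = [[3, 3, 2, 2],
            [3, 2, 2, 1],
            [2, 2, 1, 1]]
clc_2018 = [[3, 3, 3, 2],
            [3, 3, 2, 1],
            [2, 2, 1, 3]]

forest_2000 = forest_mask(clc_2000)
new_forest_2018 = forest_gain_mask(clc_2000, clc_2018)
print(forest_2000)
print(new_forest_2018)
```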


Step 2 


We then vectorize the binary raster maps computed in Step 1 
using the Polygonize function (Raster, Conversion), leaving 
all parameters at their default values. 


Step 3 


We now isolate the forest gains on the edge of the pattern. 
The aim is to distinguish between new areas of forest in 2018 
(i.e. that did not exist in 2000) which are contiguous with 
forests that existed in 2000 and others that are not. For this 
purpose, we use the Extract by location Vector Selection 
tool with the ‘touch’ operator (Fig. 15). 


Figure 16 shows a detail from the resulting layer: the 
forests that existed in 2000 are shown in light green, while 
the new forests that appeared in 2018 separately from 
existing forests are in dark green. The new forests that 
appeared in connection with forests that already existed in 
2000 are overlaid in brown. 


Step 4 


In this step we will isolate the new forests that are not con- 
nected to forests that existed in 2000. This step is optional 
insofar as new forest patches not connected to forests that 
existed in 2000 can be obtained simply by subtracting new 
connected forests from the total area for new forests. 

To get an independent layer of new forest in 2018 that is 
not connected to forests that existed in 2000, we use the 
same Extract by location tool, opting this time for the 'dis- 
joint’ operator (Figs. 17 and 18). 
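
The distinction made in Steps 3 and 4 can also be sketched directly on the binary rasters, without vectorizing them. In the illustration below, new forest patches are delineated with edge (4-neighbour) connectivity, and a patch is taken to touch the forest of 2000 when any of its cells shares an edge or a corner with a forest cell. Both choices are our assumptions about how the polygonized layers behave, and the grids and function names are hypothetical.

```python
def label_patches(mask):
    """Group cells equal to 1 into patches using edge (4-neighbour) connectivity."""
    rows, cols = len(mask), len(mask[0])
    seen, patches = set(), []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or (r, c) in seen:
                continue
            stack, patch = [(r, c)], []
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                patch.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            patches.append(patch)
    return patches

def touches(patch, mask):
    """True if any cell of the patch shares an edge or a corner with a cell of mask."""
    rows, cols = len(mask), len(mask[0])
    for y, x in patch:
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if ((dy or dx) and 0 <= ny < rows and 0 <= nx < cols
                        and mask[ny][nx] == 1):
                    return True
    return False

# Hypothetical binary rasters: forest in 2000 and new forest in 2018
forest_2000 = [[1, 1, 0, 0, 0],
               [1, 0, 0, 0, 0],
               [0, 0, 0, 0, 0],
               [0, 0, 0, 0, 0]]
new_forest = [[0, 0, 1, 0, 0],
              [0, 1, 0, 0, 0],
              [0, 0, 0, 0, 0],
              [0, 0, 0, 1, 1]]

patches = label_patches(new_forest)
connected = [p for p in patches if touches(p, forest_2000)]      # 'touch'
isolated = [p for p in patches if not touches(p, forest_2000)]   # 'disjoint'
print(len(connected), "connected,", len(isolated), "isolated")
```

Because the two rasters never have the same cell set to 1, a new forest patch can only be adjacent to, or separate from, the forest of 2000, which is why the two groups together account for all new patches.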
Fig. 14 Exercise 1. Step 1. Intermediate map displaying the overlay of forest areas in 2000 in light green and forest gains between 
2000 and 2018 in dark green 


Fig. 15 Exercise 1. Step 3. Extract by Location tool 


Step 5 


The next step is to calculate the area covered by new 
connected/unconnected forests. We use the Vector table 
Field Calculator tool to create a new attribute called area_ha 
(decimal number), using the $area function divided by 
10,000 to obtain the area in hectares (Fig. 19). 

This operation is carried out for both connected and 
isolated forests. The updated attribute tables are shown in 
Fig. 20: table for connected new forests on the left, and for 
unconnected new forests on the right. 


Step 6 


Of the various tools available to summarize the character- 
istics of the assessed patches, we use the Basic statistics for 
fields vector analysis tool. On the left of Fig. 21 we can see 
the various parameters that must be filled in, and on the right 
the log containing the sum of the areas of unconnected new 
patches of forest. 
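
Steps 5 and 6 amount to a unit conversion followed by a few summary statistics. The sketch below reproduces them for the two isolated patches, using the area values shown in their attribute table; it assumes that the layer units are square metres, and the function name is hypothetical.

```python
from statistics import mean

def to_hectares(area_m2):
    """Same operation as the Field Calculator expression $area / 10000."""
    return area_m2 / 10_000

# Areas (ha) of the two isolated new forest patches, read from the attribute table
isolated_ha = [46.247, 32.216]

summary = {
    "count": len(isolated_ha),
    "sum": round(sum(isolated_ha), 3),
    "mean": round(mean(isolated_ha), 4),
    "min": min(isolated_ha),
    "max": max(isolated_ha),
}
print(summary)
print(to_hectares(25_000))  # a hypothetical patch of 25,000 square metres
```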


Fig. 16 Exercise 1. Step 3. An example area of the resulting raster layer 


Fig. 17 Exercise 1. Step 4. Extract by Location tool 


Results and Comments 


The results consist of updated attribute tables and statistics, 
which appear in the log for the Basic Statistics for Fields 
function. After examining the attribute tables, we found that 
there were 74 contiguous and 2 isolated polygons 
representing new forests that did not appear on the map for the 
year 2000. Table 2 summarizes the basic statistics for both 
connected and unconnected new forest patches. 

As can be seen in Table 2, almost all new forest patches 
(97.4%) are connected to forests that existed in 2000. These 
patches cover 92.94% of the total area of new forest. In 
addition, to better interpret these results, we have to bear in 
mind that most of the analysed territory is covered by forest; 
there are too few isolated patches of new forest to allow us to 
come to general conclusions; and changes take place more 
frequently on the edges of existing patches, especially for 
semi-natural dynamics like reforestation, than in new, sep- 
arate areas of the landscape. 
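
The two shares quoted above follow directly from the patch counts and total areas reported in Table 2. A minimal sketch of that arithmetic is given here; the variable names are ours and purely illustrative.

```python
# Patch counts and total areas (ha) as reported in Table 2.
connected_count, isolated_count = 74, 2
connected_area, isolated_area = 1033.54, 78.46

total_count = connected_count + isolated_count
total_area = connected_area + isolated_area

# Share of new forest patches (and of new forest area) that is
# connected to forest already present in 2000.
share_of_patches = 100 * connected_count / total_count
share_of_area = 100 * connected_area / total_area

print(f"Connected patches: {share_of_patches:.1f}% of all new patches")
print(f"Connected area:    {share_of_area:.2f}% of the new forest area")
```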


Fig. 18 Exercise 1. Step 4. An example area from the resulting raster layer showing unconnected features

Fig. 19 Exercise 1. Step 5. Vector Table Field Calculator (expression: $area / 10000)

Fig. 20 Exercise 1. Step 5. Updated attribute tables

Fig. 21 Exercise 1. Step 6. Basic Statistics for Fields tool

Table 2 Results from Exercise 1. Spatial metrics for both connected and unconnected new forest patches


                               Total new patches   Connected   Unconnected
                               of forest           patches     patches
Number of patches              76                  74          2
Minimum area (ha)              0.02                0.02        32.22
Maximum area (ha)              246.22              246.22      46.25
Mean area (ha)                 14.63               13.97       39.23
Median area (ha)               5.61                4.40        39.23
Area standard deviation (ha)   31.34               31.48       7.02
Total area (ha)                1,112.00            1,033.54    78.46


3 Allocation Error Distance
Description


Allocation error distance refers to the distance between a
wrongly allocated pixel and the closest object belonging to
the same category on the reference map. It can be measured
in different ways:


(a) The minimum distance from the edge of the wrongly
allocated patch to the edge of the closest patch
belonging to the same category on the reference map.

(b) The distance between the centroids of the two patches
described in (a).


Allocation error distance can be expressed in terms of
(i) individual pixels/patches, (ii) LUC classes (mean
distance) or (iii) the mean distance for all the allocation
errors. The mean allocation error distance can be usefully
complemented by calculating the minimum, maximum and
standard deviation values when applied to several patterns
(LUC class or whole map).


Utility


Exercises


1. To validate a simulation against a reference map (vector)
2. To validate a simulation against a reference map (raster)


Simulation accuracy can be measured in different ways, such
as quantity agreement, allocation agreement, landscape
structure agreement, etc. (Hagen-Zanker 2006; Paegelow
et al. 2014) as described in Part III of this book. Generally,
the indices and maps assessing allocation error tend to focus
on the amount involved. Here we go further by measuring
"how wrong" the simulation errors are. This analysis, which
measures the individual (entity) or mean error distance (LUC
class), is complementary to the cross-tabulation of maps
at varying spatial resolution, often implemented by fuzzy
logic.
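
To make the difference between measurement options (a) and (b) concrete, the following sketch compares both distances for two small patches. It is only an illustration under stated assumptions: the patches, the 100 m cell size and the function names are hypothetical, and the edge-to-edge distance is approximated by the smallest distance between cell centres.

```python
from itertools import product
from math import hypot


def centroid(cells):
    """Mean (row, col) position of a patch given as a list of cells."""
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return sum(rows) / len(rows), sum(cols) / len(cols)


def edge_distance(patch_a, patch_b, cell_size=1.0):
    """Option (a): smallest distance between any two cells of the patches."""
    return cell_size * min(
        hypot(ra - rb, ca - cb)
        for (ra, ca), (rb, cb) in product(patch_a, patch_b)
    )


def centroid_distance(patch_a, patch_b, cell_size=1.0):
    """Option (b): distance between the centroids of the two patches."""
    (ra, ca), (rb, cb) = centroid(patch_a), centroid(patch_b)
    return cell_size * hypot(ra - rb, ca - cb)


# Hypothetical patches: a 1 x 4 error strip and a 3 x 3 reference block.
error_patch = [(0, c) for c in range(4)]
reference_patch = [(r, c) for r in range(3) for c in range(6, 9)]

d_edge = edge_distance(error_patch, reference_patch, cell_size=100.0)
d_centroid = centroid_distance(error_patch, reference_patch, cell_size=100.0)
print(d_edge, round(d_centroid, 1))
```

In this toy case the centroid distance is clearly larger than the edge distance, which is the usual situation when patches are large relative to the gap between them.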


QGIS Exercises 


Available tools


* Raster
    Raster Calculator
* Raster
    Analysis
        Proximity
* Processing Toolbox
    GRASS
        r.distance
        r.grow.distance
* Processing Toolbox
    SAGA
        Distance

GRASS and SAGA toolboxes offer several algorithms for 
measuring the distance inside a raster grid (r.grow.distance; 
SAGA distance) or the minimum distance between 
pixels/patches belonging to two different grid layers (r.dis- 
tance). Their use inside QGIS may be unstable. 

Vector analysis tools require converting raster layers into
vector format and then calculating the centroids of the
polygons obtained. The Distance to nearest hub (points) tool
creates a points layer whose table contains the minimum
distance from each point in one layer to the nearest point in
the second layer.

Both tools (raster and vector) are used in the next two 
exercises because they provide complementary results. 


Exercise 1. To validate a simulation against a reference map (vector)


Aim 


To calculate the seriousness (degree) of allocation errors for 
a specific LUC category, expressed as the minimum mean 
distance between all the pixels wrongly allocated to this 
category in the simulation and the nearest patch belonging to 
the same category on the reference map. 


Materials 


CORINE Land Cover Map Val d'Ariège 2018
Simulation LCM Val d'Ariège 2018


Requisites 


Maps can be raster or vector. They must have the same 
resolution, extent and projection. If using vector maps, 
readers can skip the first steps detailed in the execution. 


Execution 
Step 1 


We extract the real built-up areas in 2018 (Fig. 22) and the
pixels wrongly allocated as built-up areas with the Raster
Calculator (Fig. 23). The latter are areas simulated as
built-up that are not built-up according to the reference map.
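
The logic behind the two Raster Calculator expressions can be sketched outside QGIS as follows. This is a minimal illustration rather than the expressions used in the exercise: the class code for built-up areas and the two small rasters are hypothetical.

```python
BUILT_UP = 1  # hypothetical class code for built-up areas

reference = [   # hypothetical reference LUC raster (e.g. CLC 2018)
    [1, 1, 2, 2],
    [1, 2, 2, 3],
    [3, 3, 2, 2],
]
simulation = [  # hypothetical simulated LUC raster for the same year
    [1, 1, 1, 2],
    [1, 2, 2, 1],
    [3, 3, 2, 2],
]

# Real built-up areas: 1 where the reference map is built-up, 0 elsewhere.
real_built_up = [[int(v == BUILT_UP) for v in row] for row in reference]

# Allocation (commission) errors: simulated as built-up although the
# reference map shows another category.
commission_error = [
    [int(s == BUILT_UP and r != BUILT_UP) for r, s in zip(ref_row, sim_row)]
    for ref_row, sim_row in zip(reference, simulation)
]

for row in commission_error:
    print(row)
```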


Fig. 22 Exercise 1. Step 1. Raster Calculator

Fig. 23 Exercise 1. Step 1. Raster Calculator


The right map (A) in Fig. 24 is an overlay of real built-up 
areas (light grey) in 2018 (Corine Land Cover) and areas 
wrongly simulated as built-up (black). The left map in 
Fig. 24 represents the allocation errors that we will now go 
on to analyse. 


Step 2 


The two raster layers obtained in Step 1 are now polygo- 
nized into vector layers. This is done using the Polygonize 
function in the Raster—Conversion menu (Fig. 25). 

Figure 26 shows an overlay of the two vector layers: real
built-up polygons in 2018 (reference map) and areas wrongly
allocated as built-up (red) by the simulation. Results vary
depending on whether or not diagonal connections are allowed.
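
The effect of allowing diagonal connections can be illustrated with a small patch-counting sketch. It is not the polygonizing algorithm itself, only a hypothetical example showing why the number of resulting polygons depends on the connectivity rule.

```python
from collections import deque


def count_patches(mask, diagonal=False):
    """Count groups of connected 1-cells using 4- or 8-connectivity."""
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    n_rows, n_cols = len(mask), len(mask[0])
    seen = set()
    patches = 0
    for r in range(n_rows):
        for c in range(n_cols):
            if mask[r][c] != 1 or (r, c) in seen:
                continue
            # New patch found: flood-fill all cells connected to it.
            patches += 1
            seen.add((r, c))
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in steps:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < n_rows and 0 <= nc < n_cols
                            and mask[nr][nc] == 1 and (nr, nc) not in seen):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
    return patches


# Hypothetical mask in which cells touch only at their corners.
mask = [
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
print(count_patches(mask), count_patches(mask, diagonal=True))
```

Fewer, larger polygons also mean fewer centroids in the following steps, so the connectivity choice propagates to the distance statistics.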


Step 3 


We then calculate the centroids for each of these vector 
layers with the Centroids tool (Vector—Geometric tools) 
(Fig. 27). 
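
For readers who wish to see what a centroid calculation involves, the sketch below computes the area-weighted centroid of a simple polygon with the standard shoelace-based formula. It is an illustration only: the L-shaped polygon is hypothetical and we do not claim that the Centroids tool uses exactly this implementation.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon given as (x, y) vertices."""
    twice_area = 0.0
    cx = cy = 0.0
    # Pair each vertex with the next one, closing the ring at the end.
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    # The formula divides by 6 * area, i.e. 3 * twice_area.
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


# Hypothetical L-shaped patch (coordinates in map units).
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
x, y = polygon_centroid(l_shape)
print(round(x, 3), round(y, 3))
```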


Step 4 


Once we have obtained the two centroid maps (built-up
areas in 2018 and built-up allocation errors), we use the
Distance to nearest hub (points) tool available in the
Processing Toolbox (QGIS Vector). The source points
layer is the point layer containing allocation errors and the
destination hubs layer is the layer containing the built-up
centroids from the reference map (Fig. 28). We measure
the distance in metres and give the output point layer a
name.

Fig. 24 Exercise 1. Step 1. Intermediate map showing the built-up areas correctly allocated in light grey (A) and the wrongly simulated built-up
areas in black (B)

Fig. 25 Exercise 1. Step 2. Polygonize (Raster to Vector), with the Use 8-connectedness option enabled

Fig. 26 Exercise 1. Step 2. Intermediate map showing built-up areas
correctly simulated in cyan and wrongly allocated built-up areas in red

Fig. 27 Exercise 1. Step 3. Centroids

Fig. 28 Exercise 1. Step 4. Distance to Nearest Hub (Points)
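
The nearest-hub search carried out in Step 4 can be sketched as a brute-force comparison of every source point with every hub. The coordinates and hub names below are hypothetical, and the function is our own illustration rather than the QGIS implementation.

```python
from math import hypot


def nearest_hub(source_points, hubs):
    """For each source point, return (hub name, distance to closest hub)."""
    results = []
    for sx, sy in source_points:
        name, (hx, hy) = min(
            hubs.items(),
            key=lambda item: hypot(sx - item[1][0], sy - item[1][1]),
        )
        results.append((name, hypot(sx - hx, sy - hy)))
    return results


# Hypothetical centroids in projected coordinates (metres).
error_centroids = [(0.0, 0.0), (900.0, 400.0)]
built_up_centroids = {"A": (300.0, 400.0), "B": (1000.0, 400.0)}

hub_table = nearest_hub(error_centroids, built_up_centroids)
for hub_name, hub_dist in hub_table:
    print(hub_name, hub_dist)
```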
Step 5


To obtain the desired statistics about the allocation error 
distance for wrongly simulated built-up areas, we use the 
Basic statistics for fields tool (Processing Toolbox, Vector— 
analysis) by selecting the field containing the calculated 
distance to the nearest hub (Fig. 29). 
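
The kind of summary produced in this step can be reproduced with Python's standard library. The distances below are hypothetical, and the use of the population standard deviation is an assumption on our part; readers should check which convention their tool reports.

```python
import statistics

# Hypothetical nearest-hub distances (m), one value per allocation error.
hub_dist = [120.0, 480.0, 950.0, 1300.0, 2150.0]

summary = {
    "COUNT": len(hub_dist),
    "MIN": min(hub_dist),
    "MAX": max(hub_dist),
    "MEAN": statistics.mean(hub_dist),
    "MEDIAN": statistics.median(hub_dist),
    # Assumption: population standard deviation (divides by n, not n - 1).
    "STD_DEV": statistics.pstdev(hub_dist),
}
# Coefficient of variation: spread relative to the mean.
summary["CV"] = summary["STD_DEV"] / summary["MEAN"]

for key, value in summary.items():
    print(f"{key}: {value}")
```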


Fig. 29 Exercise 1. Step 5. Basic Statistics for Fields


Results and Comments 


The resulting points layer contains one point for each
allocation error polygon, at the same location. The
corresponding table contains the minimum distance
between each allocation error (centroid) and the nearest
existing built-up area (centroid) on the reference map
(Fig. 30).

Fig. 30 Result from Exercise 1. Attribute table indicating (HubDist)
the minimum distance between each allocation error and the nearest
built-up area

A summary of the statistics appears in the log of the Basic
statistics for fields function (Fig. 31).

As we can see, the mean distance for the 132 allocation errors
is about 1,236 m. This is quite close to the median value
(1,119 m), although the standard deviation is also quite high
(775 m). When interpreting these values, it is important to
remember how the distance was calculated: from centroids,
which reduce the built-up areas (polygons) to single points.
If we had measured the distance from the nearest edge to the
nearest edge, the values would have been lower.

The mean allocation error distance of about 1.2 km 
should be put into context by comparing it with the spatial 
extent of the layer, which is about 31 x 62 km. It may also 
be useful to compare this value with the mean allocation 
error distances for other LUC categories and the mean value 
for all the allocation errors. 
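
A simple way of putting these figures into context is to express the spread relative to the mean and the mean relative to the size of the study area. The sketch below uses the rounded values quoted in the text; relating the mean distance to the shorter side of the extent is our own choice of yardstick, not a metric defined in this chapter.

```python
mean_dist_m = 1236.0       # mean allocation error distance (Exercise 1)
std_dist_m = 775.0         # standard deviation of the distances
extent_km = (31.0, 62.0)   # approximate extent of the layer

# Coefficient of variation: how dispersed the distances are.
cv = std_dist_m / mean_dist_m

# Mean distance as a percentage of the shorter side of the extent.
share_of_short_side = 100 * (mean_dist_m / 1000) / min(extent_km)

print(f"CV = {cv:.2f}")
print(f"Mean distance = {share_of_short_side:.1f}% of the shorter side")
```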


Exercise 2. To validate a simulation against a 
reference map (raster) 


Aim 


To calculate the seriousness (degree) of allocation errors for 
a specific LUC category expressed as the minimum, indi- 
vidual and mean distance between wrongly allocated areas 
(simulation map) and the nearest patch belonging to the 
same LUC category on the reference map. 


Materials 

CORINE Land Cover Map Val d'Ariège 2018

Simulation LCM Val d'Ariège 2018

Built-up allocation error map (generated during Exercise 1) 


Requisites 


All maps must be rasters and have the same resolution, 
extent and projection. 


Execution 
Step 1 
First, we compute a raster map of the distance from built-up
areas using the QGIS raster function Proximity (Fig. 32). If
the built-up areas layer is not available, it must be extracted
from the CLC 2018 layer using the Raster Calculator (see Step
1 of the previous exercise).

In the Proximity tool, the input layer is built-up areas in
2018. We have to specify the target pixels (value 1 in the
input layer) and the fact that we want to calculate the
distance in Coordinate Reference System (CRS) units (Fig. 32).
The result is shown in Fig. 33. This map illustrates the
distance between areas wrongly allocated to built-up (red) in
the simulation and real built-up areas on the reference map
(mapped in grey).

Fig. 31 Result from Exercise 1. Log window from Basic Statistics for Fields tool. Values shown in the log:

COUNT: 132
CV: 0.62723721302285321
EMPTY: 0
FILLED: 132
FIRSTQUARTILE: 608.3191480711826
IQR: 1188.9517389924517
MAJORITY: 35.285750182722104
MAX: 4031.2025628251654
MEAN: 1235.8905702081654
MEDIAN: 1118.7276322916427
MIN: 35.285750182722104
MINORITY: 35.285750182722104
RANGE: 3995.9168126424433
STD_DEV: 775.1965568581977
SUM: 163137.55526747784
THIRDQUARTILE: 1797.2708870636343
UNIQUE: 132

Fig. 32 Exercise 2. Step 1. Proximity (Raster Distance)

Fig. 33 Exercise 2. Step 1. Distance map between wrongly simulated and real built-up areas
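
The proximity calculation in Step 1 assigns to every cell the distance to the nearest target cell. The brute-force sketch below shows the idea on a tiny grid; the raster, the 15 m cell size and the function name are hypothetical, and production tools use much faster algorithms.

```python
from math import hypot


def proximity(mask, target=1, cell_size=1.0):
    """Distance from every cell to the nearest cell equal to `target`."""
    targets = [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value == target
    ]
    if not targets:
        raise ValueError("the mask contains no target cells")
    return [
        [
            cell_size * min(hypot(r - tr, c - tc) for tr, tc in targets)
            for c in range(len(row))
        ]
        for r, row in enumerate(mask)
    ]


# Hypothetical built-up mask (1 = built-up in the reference map).
built_up = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
dist = proximity(built_up, target=1, cell_size=15.0)
for row in dist:
    print([round(d, 1) for d in row])
```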


Step 2 


Once you have obtained a distance map and an allocation
error map in vector format (obtained in the previous
exercise, Step 2), the next step involves extracting statistics
from the raster distance map in order to update the table for
the polygon (vector) layer of allocation errors using the Zonal
statistics tool (Processing Toolbox) (Fig. 34).

Open this function and choose the distance map to
built-up areas 2018 (reference map) and the vector layer
containing the allocation errors for the built-up category in
2018 (simulation). The attribute table of the vector layer
will be extended by one or more additional columns,
depending on the number of statistics selected. In this case,
the following values were measured: minimum, mean,
median, standard deviation and maximum (Fig. 35).
Figure 36 shows the updated table.
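
Conceptually, zonal statistics group the cells of the distance raster by the polygon (zone) they fall in and summarise each group. The sketch below illustrates this with a rasterised zone layer; the rasters and zone identifiers are hypothetical, and the final line shows the kind of column average that can afterwards be computed in a spreadsheet.

```python
import statistics

distance = [  # hypothetical distance raster (m)
    [30.0, 15.0, 0.0, 15.0],
    [45.0, 30.0, 15.0, 30.0],
    [60.0, 45.0, 30.0, 45.0],
]
zones = [     # hypothetical zone ids (0 = no allocation error)
    [1, 1, 0, 0],
    [1, 0, 0, 2],
    [0, 0, 2, 2],
]


def zonal_statistics(values, zone_ids):
    """Summarise the raster values falling inside each non-zero zone."""
    cells = {}
    for value_row, zone_row in zip(values, zone_ids):
        for value, zone in zip(value_row, zone_row):
            if zone != 0:
                cells.setdefault(zone, []).append(value)
    return {
        zone: {
            "min": min(v),
            "mean": statistics.mean(v),
            "median": statistics.median(v),
            "stdev": statistics.pstdev(v),  # assumption: population form
            "max": max(v),
        }
        for zone, v in cells.items()
    }


stats = zonal_statistics(distance, zones)
# Average of the per-polygon means, as done later for the summary table.
mean_of_means = statistics.mean(s["mean"] for s in stats.values())
print(stats[1]["mean"], stats[2]["mean"], mean_of_means)
```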


Step 3 


The third and last step can be done in a spreadsheet. We
calculate the mean of each statistic (mean, median, standard
deviation, minimum and maximum) across the individual
distances extracted (Table 3).

Fig. 34 Exercise 2. Step 2. Zonal Statistics (algorithm ID: qgis:zonalstatistics)

Fig. 35 Exercise 2. Step 2. Zonal Statistics

Fig. 36 Exercise 2. Step 2. Updated attribute table

Table 3 Results from Exercise 2. Calculated statistics


         mean     median   stdev    min      max
Mean     56.96    53.11    21.98    28.54    106.81


Results and Comments


As we can see, the mean minimum distance for built-up
commission errors is about 28.5 m. The mean distance is
close to 57 m. The mean maximum distance is quite small
(106.81 m) and the standard deviation is low (21.98 m). This
means that allocation errors affect small patches or are close
to the right location.

The values obtained in this exercise differ greatly from
those obtained in Exercise 1. During Exercise 1 we calculated
the distances between the centroids of polygons. This
may result in longer distances than those generated by the
technique used in Exercise 2, which measures the mean or
minimum distance. The two techniques can produce different
results, depending on the number, the extent and the
shape of the features being analysed.
Abstract


One of the most commonly used techniques for validating
Land Use Cover (LUC) maps is the set of accuracy assessment
statistics derived from the cross-tabulation matrix.
However, although these accuracy metrics are applied to
spatial data, this does not mean that they produce spatial 
results. The overall, user's and producer's accuracy 
metrics provide global information for the entire area 
analysed, but shed no light on possible variations in 
accuracy at different points within this area, a shortcom- 
ing that has been widely criticized. To address this issue, 
a series of techniques have been developed to integrate a 
spatial component into these accuracy assessment statis- 
tics for the analysis and validation of LUC maps. 
Geographically Weighted Regression (GWR) is a local 
technique for estimating the relationship between a 
dependent variable with respect to one or more indepen- 
dent variables or explanatory factors. However, unlike 
traditional regression techniques, it considers the distance 
between data points when estimating the coefficients of 
the regression points using a moving window. Hence, it 
assumes that geographic data are non-stationary i.e., they 
vary over space. Geographically weighted methods 
provide a non-stationary analysis, which can reveal the 
spatial relationships between reference data obtained from 
a LUC map and classified data. Specifically, logistic 
GWR is used in this chapter to estimate the accuracy of 
each LUC data point, so allowing us to observe the spatial 
variation in overall, user's and producer's accuracies. 
A specific tool (Local accuracy assessment statistics) was 
specially developed for this practical exercise, aimed at 
validating a Land Use Cover map. The Marqués de 
Comillas region was selected as the study area for 
implementing this tool and demonstrating its
applicability. For the calculation of the user's and
producer's accuracy metrics, we selected the tropical rain 
forest category [50] as an example. Furthermore, a series 
of maps were obtained by interpolating the results of the 
tool, so enabling a visual interpretation and a description 
of the spatial distribution of error and accuracy. 


Keywords 


Geographically Weighted Regression * Overall accuracy 
* User's accuracy * Producer's accuracy 


1 Overall, User's and Producer's Accuracy 
Through GWR 


Description 


Overall accuracy (OA), user's accuracy (UA) and producer's 
accuracy (PA) are assessment metrics obtained from the 
cross-tabulation matrix (see Sect. 5 in chapter “Metrics Based on 
a Cross-Tabulation Matrix to Validate Land Use Cover Maps"). 
Overall accuracy is expressed as the proportion of the map 
that has been correctly classified. User's accuracy indicates the 
probability that a pixel from a specific category on the classified 
map correctly represents the real situation on the ground or 
reference map. Producer's accuracy indicates the probability that 
a reference pixel belonging to a specific category has been cor- 
rectly allocated to that category (Story and Congalton 1986). 
These last two metrics (user's and producer's accuracies) refer to 
commission and omission errors, respectively. 

None of these accuracy assessment statistics produces
spatially distributed information, i.e., they provide a single
accuracy value for the entire study area or for each land
use/land cover class. However, it is possible to explore how
the error and accuracy of a classified map are spatially
distributed with respect to reference data using Geographically
Weighted Regression (GWR) methods.

GWR allows us to explore local spatial relationships
between a dependent variable and a set of explanatory
variables (Brunsdon et al. 1996; Fotheringham et al. 2002).
In this chapter, we use the logistic version of the
geographical weighting method (GWLR) to generate land
use/land cover accuracy metrics with spatial variation,
according to the proposal by Comber (2013), which was
later developed in Comber et al. (2012), Comber et al.
(2017) and Tsutsumida and Comber (2015).

GWR is a statistical technique in which regression points 
are estimated on the basis of the spatial distribution of data 
points. A moving window analyses the data points it collects to 
estimate the coefficients of the selected regression point. This 
window, or kernel, weights each data point according to the 
distance within the window and the assigned weighting 
function (gaussian, exponential, bisquare, tricube, boxcar). 
Its maximum weighting value is 1 and this decreases as the 
distance between the observation and calibration data points 
increases. The size of the kernel is defined by the bandwidth, 
which indicates the number of data points that will be included 
in the local calculation for each regression point. This can 
consider either a fixed or a variable number of reference data 
points. If a fixed number of points are considered, a specific 
number will be obtained, while in the case of a variable 
number, a distance value is given. The number of reference 
data points therefore varies according to their distribution. It is 
important to select a suitable bandwidth so as to minimise the 
cross-validation prediction error. According to Fotheringham 
et al. (2002), the GWR formula is 


$$y_i = \beta_0(u_i, v_i) + \sum_k \beta_k(u_i, v_i)\, x_{ik}$$

where $\beta_0(u_i, v_i)$ is the intercept, $\beta_k(u_i, v_i)$ is the
coefficient, $x_{ik}$ is the value of the explanatory variable, and
$(u_i, v_i)$ are the coordinates of the data point (Fig. 1).
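
The role of the kernel can be illustrated with a short sketch of Gaussian weighting for a fixed bandwidth. The functional form used here, $w_{ij} = \exp(-0.5\,(d_{ij}/b)^2)$, is one common way of writing a Gaussian kernel and is an assumption on our part; the coordinates and the bandwidth value are hypothetical.

```python
from math import exp, hypot


def gaussian_weights(regression_point, data_points, bandwidth):
    """Gaussian kernel weights w_ij = exp(-0.5 * (d_ij / bandwidth) ** 2)."""
    u, v = regression_point
    return [
        exp(-0.5 * (hypot(u - x, v - y) / bandwidth) ** 2)
        for x, y in data_points
    ]


# Hypothetical data points (projected coordinates in metres).
data_points = [(0.0, 0.0), (300.0, 400.0), (3000.0, 0.0)]
weights = gaussian_weights((0.0, 0.0), data_points, bandwidth=1000.0)
print([round(w, 4) for w in weights])
```

The weight is 1 for a data point located at the regression point and falls towards 0 as the distance grows beyond the bandwidth.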

This geographically weighted method was adapted for the 
calculation of local accuracy assessment statistics by Com- 
ber (2013). According to his proposal, the probability that a 
reference data point is correctly identified by a classified data 
point is given by 


$$\text{Overall accuracy} = P(A = 1) = \operatorname{logit}\big(\beta_0(u_i, v_i)\big)$$


where P(A = 1) is the probability that the agreement A
between the classified data and the reference data is equal
to 1. A is 0 when there is no agreement and 1 when
there is agreement.

To estimate user's accuracy, it is necessary to analyse the
reference data against the classified data. This metric
indicates the probability that the reference LUC class $y_i$ is
correctly predicted by the classified data $x_i$.


$$\text{User's accuracy} = P(y_i = 1) = \operatorname{logit}\big(\beta_0(u_i, v_i) + \beta_1(u_i, v_i)\, x_i\big)$$


To estimate producer's accuracy, it is necessary to analyse
the classified data against the reference data. This indicates
the probability that the classified data $x_i$ correctly represents
reference LUC class $y_i$.


$$\text{Producer's accuracy} = P(x_i = 1) = \operatorname{logit}\big(\beta_0(u_i, v_i) + \beta_1(u_i, v_i)\, y_i\big)$$


Fig. 1 Spatial kernel. Regression point, data points and bandwidth are observed. The curve represents the Gaussian function that establishes the
weighting of the data points for the regression point; $w_{ij}$ is the weight of data point j at regression point i and $d_{ij}$ is the distance between
regression point i and data point j. Retrieved from Fotheringham et al. (2002)

Finally, in order to obtain the accuracy values, the
coefficients have to be adjusted. To this end, the coefficients
are added together, and an alogit function (inverse logit) is
applied.
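
The adjustment described above can be sketched in a few lines. The coefficient values are hypothetical and only serve to show how the sum of the local coefficients is mapped to a probability by the inverse logit.

```python
from math import exp


def alogit(z):
    """Inverse logit: maps a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + exp(-z))


# Hypothetical local coefficients estimated at one regression point.
beta0, beta1 = -1.2, 3.0

# User's (or producer's) accuracy: coefficients added together,
# which corresponds to an explanatory variable equal to 1.
local_accuracy = alogit(beta0 + beta1)

# Overall accuracy uses an intercept-only model (hypothetical value 0.4).
local_overall = alogit(0.4)

print(round(local_accuracy, 4), round(local_overall, 4))
```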


Utility 


Exercises 


1. To validate a map against reference data/map 


Geographically Weighted methods can be used to validate
single LUC maps by analysing spatial variations in the
agreement between reference data and classified remotely
sensed data, so enabling us to analyse the spatial
non-stationarity of LUC data error and accuracy. They allow
us to explore the spatial relationships between the reference
data and the classified data, exposing possible clusters of
land cover errors, and reporting the values for each data
point, in contrast to global accuracy assessment statistics,
which only provide a single value for the entire map.

This technique allows us not only to discover what pro- 
portion of the map has been correctly classified but also to 
estimate in which areas the classification fits best and to 
analyse possible trends that are only visible spatially. In this 
way, the spatial distribution of the overall, user's and pro- 
ducer's accuracy metrics can be visualized on a map so as to 
enable a better understanding of classification uncertainty. 


QGIS Exercise 


Available tools 


* Processing Toolbox
    R
        Geographically weighted methods
            Local accuracy assessment statistics
    Interpolation
        IDW Interpolation
    GDAL
        Raster extraction
            Clip raster by mask layer


By default, there are no tools in QGIS that carry out a
Geographically Weighted Methods analysis to estimate
overall, user's and producer's accuracy values for local
areas. We have therefore developed a tool to calculate
these local accuracy assessment metrics in QGIS using R, in
which Geographically Weighted Methods are already implemented.

The Local accuracy assessment statistics script is based
on the code developed by Professor Alexis J. Comber from
the University of Leicester,¹ which was created mainly using
the "spgwr" R package.² The script provides overall,
user's and producer's accuracy values for each data point, so
allowing accuracy and error distribution areas to be
generated by interpolation of the results obtained by the tool.

First, to estimate local OA values, the tool calculates 
internally, for each data point, the agreement between the 
reference data and the classified data, where 0 represents 
disagreement and 1 represents agreement. Agreement is 
automatically selected as dependent variable [y] and “1” is 
selected as independent variable [x], where P(A = 1) is the 
probability that agreement is equal to 1. 
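
For an intercept-only logistic model fitted with kernel weights, the maximum-likelihood probability at a regression point equals the weighted proportion of agreeing data points around it. The sketch below uses this property to illustrate local overall accuracy; it is not the script used in the exercise, and the sample points, class codes, Gaussian kernel form and bandwidth are hypothetical.

```python
from math import exp, hypot


def local_overall_accuracy(point, samples, bandwidth):
    """Kernel-weighted share of agreeing samples around `point`.

    samples: iterable of (x, y, reference_class, classified_class).
    """
    u, v = point
    weighted_agreement = total_weight = 0.0
    for x, y, reference, classified in samples:
        w = exp(-0.5 * (hypot(u - x, v - y) / bandwidth) ** 2)
        weighted_agreement += w * (reference == classified)
        total_weight += w
    return weighted_agreement / total_weight


# Hypothetical samples: agreement in the west, confusion in the east.
samples = [
    (0, 0, 50, 50), (1, 0, 50, 50), (0, 1, 20, 20),
    (10, 0, 50, 20), (11, 0, 20, 50), (10, 1, 50, 50),
]
oa_west = local_overall_accuracy((0, 0), samples, bandwidth=2.0)
oa_east = local_overall_accuracy((10, 0), samples, bandwidth=2.0)
oa_global = sum(r == c for _, _, r, c in samples) / len(samples)
print(round(oa_west, 3), round(oa_east, 3), round(oa_global, 3))
```

The single global value hides the contrast between the two parts of the toy study area, which is precisely what the local estimates reveal.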

To estimate local UA values, the tool generates a new data
frame with two columns. One column shows the
presence (1)/absence (0) of the chosen category in the
reference data, while the other column shows the same for
the classified data. The reference data (RD) is selected
as dependent variable [y], and the classified data (CD) is
selected as independent variable [x], so that the estimated
probability is P(RD = 1|CD = 1). The procedure for producer's
accuracy is very similar. The classified data for the chosen
category is selected as dependent variable [y], and the
reference data is selected as independent variable [x], so
that the estimated probability is P(CD = 1|RD = 1).
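
The agreement and presence/absence columns described above, together with the global (non-spatial) versions of the three metrics, can be sketched as follows. The sample values are hypothetical; category 50 is used only because it is the example category mentioned in this chapter.

```python
# Hypothetical reference and classified classes for eight sample points.
reference = [50, 50, 20, 50, 20, 50, 30, 50]
classified = [50, 20, 20, 50, 50, 50, 30, 30]
CATEGORY = 50

# Agreement column used for overall accuracy (1 = agreement).
agreement = [int(r == c) for r, c in zip(reference, classified)]

# Presence (1) / absence (0) of the chosen category in each data set.
rd = [int(r == CATEGORY) for r in reference]
cd = [int(c == CATEGORY) for c in classified]

both = sum(r and c for r, c in zip(rd, cd))

overall_accuracy = sum(agreement) / len(agreement)
users_accuracy = both / sum(cd)       # P(RD = 1 | CD = 1)
producers_accuracy = both / sum(rd)   # P(CD = 1 | RD = 1)

print(overall_accuracy, users_accuracy, producers_accuracy)
```

The geographically weighted version estimates the same conditional probabilities locally, giving more weight to the data points closest to each regression point.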

In order to ensure that the tool works correctly, various
parameters must be configured. Selecting an appropriate
bandwidth is particularly important. A small bandwidth would
include too few data points in the local sample, making it
unreliable for calibrating the model, while a large bandwidth
would include too many data points, so reducing the local
analysis capacity. A spatially distributed data sample is also
required.

The fact that the parameters must be configured and the 
need for more in-depth knowledge to interpret the results 
could be considered a disadvantage when choosing these 
validation methods. Another important consideration is that 
using large data samples can lead to long runtimes. 


Exercise 1. To validate a map against reference 
data/map 


Aim 


To assess the spatial variation of accuracy assessment
measures (overall, user's and producer's accuracy) when
validating the Marqués de Comillas LUC map against a
reference set of points.

¹ The code is available at the personal repository of Professor Alexis
J. Comber. https://github.com/lexcomber/Accuracy Workshop2016.

² Full details of this R package and the functions it includes may be
found at https://cran.r-project.org/web/packages/spgwr/spgwr.pdf.

Fig. 2 Exercise 1. Step 1. Local accuracy assessment statistics (Overall accuracy). Settings shown: layer random_sample_points, accuracy Overall, bandwidth 0.15


Materials 


Marqués de Comillas random sample points from Mexico 
(2019) 
Boundary of Marqués de Comillas 


Requisites 


The data points must be projected in their corresponding 
reference system. The vector point file must include two 
attributes, one corresponding to reference LUC data and one 
to classified LUC data. It is recommended that the 
data points have an appropriate random distribution. 
Sample size should not be overly large, as this could lead to 
long runtimes. 


Execution 


If necessary, install the Processing R provider plugin, and 
download the Local accuracy assessment statistics.rsx R 
script into the R scripts folder (processing/rscripts). For 
more details, see chapter “About This Book” of this book. 


Step 1 


Open the Local accuracy assessment statistics function and 
fill in the required parameters (see Fig. 2). The input for this 
tool is the point layer containing the LUC random sample 
dataset. Select the type of accuracy assessment statistic to be 
obtained (“Overall”), and indicate the attribute table columns 
containing the reference data and the classified data. The 
category can also be indicated, although this is only used to 
estimate the user's and producer's accuracy values. The 
remaining value to be set is the bandwidth, which in this 
exercise is 0.15. This means that the nearest 15% of the data 
points will be used to estimate the coefficient for each 
regression point. By default, the tool internally applies a 
Gaussian kernel. 


Fig. 3 Exercise 1. Step 2. Local accuracy assessment statistics (User's accuracy). The dialog shows the “User” accuracy option, category 50 and a bandwidth of 0.15 
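To make the meaning of the 0.15 bandwidth more concrete, the following sketch shows one way an adaptive bandwidth can be turned into a distance for a given regression point. The rounding rule (ceiling of proportion times sample size) and the function names are assumptions made for illustration; the R package used by the tool may select the neighbours differently.

```python
import math


def neighbours_used(n_points, proportion):
    """Number of nearest neighbours implied by an adaptive bandwidth (assumed rule)."""
    return max(1, math.ceil(proportion * n_points))


def adaptive_bandwidth(point, coords, proportion):
    """Distance from `point` to its k-th nearest sample, with k from neighbours_used()."""
    k = neighbours_used(len(coords), proportion)
    distances = sorted(math.hypot(x - point[0], y - point[1]) for x, y in coords)
    return distances[k - 1]
```

Under this assumed rule, a sample of 747 points and a bandwidth of 0.15 would give 113 neighbours per regression point.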


Step 2 


The parameter configuration for calculating User's Accuracy 
is very similar. Select the corresponding accuracy assessment 
statistic in the “Accuracy” option (“User”) and the 
category you want to assess in the “Category” option (see 
Fig. 3). In this exercise, we will be using the tropical rain 
forest class [50] as an example. 


Step 3 


To estimate the producer's accuracy values, the same steps 
must be followed (see Fig. 4). Select the corresponding 
accuracy assessment statistic ("Producer"), and the tool will 
modify the internal inputs. The tropical rain forest class [50] 
will again be used as an example. 


Step 4 


Finally, interpolate the coefficients estimated by the Local 
accuracy assessment statistics tool using the Inverse 
Distance Weighted method (IDW interpolation tool in 
QGIS) (see Fig. 5). This produces a map showing the continuous 
variation in the spatial distribution of the accuracy measures, 
which makes them easier to interpret visually. 


Fig. 4 Exercise 1. Step 3. Local accuracy assessment statistics (Producer's accuracy). The dialog shows the “Producer” accuracy option, category 50 and a bandwidth of 0.15 

The attribute columns that the tool creates for the local 
overall, user’s and producer’s accuracy values are named 
“g_ SDF_”, “coefs_u” and “coefs_p” respectively. The relevant 
column must be specified in the “Interpolation attribute” 
option, in line with the accuracy metric being analysed. 
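The interpolation carried out in this step can be sketched as follows. This is a generic inverse distance weighting estimator, not the QGIS implementation; the default power of 3 mirrors the distance coefficient P shown in Fig. 5, and the function name and sample layout are hypothetical.

```python
import math


def idw(target, samples, power=3.0):
    """Inverse distance weighted estimate at `target`.

    samples: list of (x, y, value) tuples, e.g. local accuracy values at the points.
    """
    num = den = 0.0
    for x, y, value in samples:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            # The target coincides with a sample: return its value directly.
            return value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den
```

A higher power gives more influence to the closest points, so the surface follows the local values more tightly.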


Step 5 


As an additional, optional step, the raster images obtained by 
interpolation can be clipped by mask using the Marqués de 
Comillas boundary (Clip raster by mask layer tool in QGIS) 
in order to provide a better visual representation. In addition, 
a discrete colour scale using six classes was chosen in order 
to make interpretation of the data more straightforward. 
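As a small illustration of the six-class discrete scale, the sketch below assigns a value to a class using the breaks shown in the legend of the overall accuracy map (Fig. 7). The function name is hypothetical, and treating the upper limit of each class as inclusive is an assumption based on the "<= 0.78" label of the first class.

```python
from bisect import bisect_left

# Class breaks taken from the overall accuracy legend (Fig. 7).
OA_BREAKS = [0.78, 0.79, 0.80, 0.82, 0.83]


def class_index(value, breaks=OA_BREAKS):
    """Return the class (0 to 5) of `value`; upper class limits are inclusive."""
    return bisect_left(breaks, value)
```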


Results and Comments 


After the execution of the previous steps, we obtain a new 
attribute column with the estimated local values for OA, UA 
and PA respectively, and the interpolated distribution maps for 
these accuracy measures. Another output of the tool is a new 
layer that includes the estimated Overall Accuracy value for 
each data point. In addition, a summary of the local and overall 
values calculated is displayed in the log window (Fig. 6). It 
shows the minimum, first quartile, median, mean, third quartile, 
maximum and global overall accuracy values (Table 1). 
The IDW interpolation method is used to generate a 
continuous surface that visually represents the distribution of 
the values obtained, offering a more detailed spatial 
representation of the distribution of accuracy and error than 
that provided by a single overall accuracy value. Figure 7 
clearly shows a higher degree of accuracy in the north of the 
map, which decreases towards the south and east. 


Fig. 5 Exercise 1. Step 4. IDW Interpolation. The dialog shows the oa points vector layer as input, a distance coefficient P of 3.0, a pixel size of 100 x 100, an output raster of 592 rows by 629 columns and the extent 718095.6643,780929.1625,1778200.9883,1837254.3530 [EPSG:32615] 

The example category in this exercise is tropical rain 
forest (code 50). User's accuracy describes the commission 
errors in the tropical rain forest category. Its local values 
range between 0.56 and 0.88, a variation of 0.32, around a 
global value of 0.74 for the entire study area (Fig. 8). 

Figure 9 represents the probability that a classified data 
point belonging to the tropical rain forest class is correctly 
represented by the reference data (User's accuracy). Values 
are high through the centre and south of the region, but fall 
as we move away to the northeast. 

The last part of this exercise focuses on Producer's 
Accuracy. In this case, it describes omission errors related to 
the tropical rain forest class. Producer's accuracy varies from 
0.57 to 0.89 (a variation of 0.32), around a global value of 
0.74 for the entire area (Fig. 10). 

Figure 11 represents the probability that any reference 
data point is correctly classified (producer's accuracy). Most 
of the omission errors are concentrated in the north-east of 
our study area, while higher levels of producer's accuracy 
can be seen in the south-west. 
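For comparison with the local values, the global user's and producer's accuracy of a single category can be obtained directly from the paired reference and classified labels. The following is a minimal sketch with a hypothetical function name; commission error is then 1 minus user's accuracy and omission error is 1 minus producer's accuracy.

```python
def user_producer_accuracy(reference, classified, category):
    """Global user's and producer's accuracy of `category` from paired labels."""
    pairs = list(zip(reference, classified))
    both = sum(r == category and c == category for r, c in pairs)
    n_classified = sum(c == category for _, c in pairs)  # points mapped as category
    n_reference = sum(r == category for r, _ in pairs)   # points that really are category
    ua = both / n_classified if n_classified else float("nan")
    pa = both / n_reference if n_reference else float("nan")
    return ua, pa
```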
Fig. 6 Results from Exercise 1 displayed in the “output” window of the “Local accuracy assessment statistics” tool, showing variations in overall accuracy 


Table 1 Results from Exercise 1. Table summarizing the variations in Overall, User's and Producer's accuracy values 


                             Min     1st Qu  Median  Mean    3rd Qu  Max     Global 
Overall Accuracy (OA)        0.7683  0.7868  0.7962  0.8004  0.8140  0.8412  0.8032 
User's Accuracy (UA) a       0.5564  0.6702  0.7609  0.7432  0.8201  0.8777  0.7403 
Producer's Accuracy (PA) a   0.5665  0.6424  0.6986  0.7168  0.7979  0.8905  0.7403 


a These values are for the tropical rain forest class [50] 


The values set out in Figs. 6, 8 and 10 are summarized in 
Table 1, which shows the variations in the accuracy of the 
classified data points with respect to the reference data 
points. The Overall accuracy value for the entire study area 
is 0.80. Nonetheless, it has been demonstrated that OA 
varies over space. The minimum value is 0.77 and the 
maximum is 0.84, which means that a variation of 0.07 is 
observed. 
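The ranges of variation discussed here follow directly from the minimum and maximum values in Table 1, as the short calculation below shows. The dictionary layout and function name are only illustrative.

```python
# Minimum and maximum local values taken from Table 1.
TABLE_1 = {
    "OA": {"min": 0.7683, "max": 0.8412},
    "UA": {"min": 0.5564, "max": 0.8777},
    "PA": {"min": 0.5665, "max": 0.8905},
}


def value_range(row):
    """Range of variation (max - min), rounded to four decimals."""
    return round(row["max"] - row["min"], 4)


ranges = {name: value_range(row) for name, row in TABLE_1.items()}
```

This gives ranges of 0.0729 for overall accuracy, 0.3213 for user's accuracy and 0.3240 for producer's accuracy.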

Producer's accuracy has the highest range of variation, 
with User's accuracy close behind. By contrast, Overall 
accuracy has a relatively small range, indicating low levels 
of spatial variation. Despite this, the maximum Overall 
accuracy value (0.84) is below the minimum accuracy level 
of 0.85 proposed by Anderson (1971). 

In conclusion, Local accuracy assessment statistics 
should be considered as a useful complement to the 
cross-tabulation matrix and its global accuracy statistics in 
that they provide more detailed information that can help 
improve classification techniques by locating possible error 
clusters with greater precision. It is also important to stress 
that a visual interpretation can enable better decisions to be 
taken when evaluating and validating LUC maps. 
Fig. 7 Results from Exercise 1. Map showing the spatial distribution of overall accuracy values. Legend classes: <= 0.78; 0.78 - 0.79; 0.79 - 0.80; 0.80 - 0.82; 0.82 - 0.83; > 0.83. Scale bar: 0 to 10 km 


Fig. 8 Results from Exercise 1 displayed in the “output” window of the “Local accuracy assessment statistics” tool, showing variations in user's accuracy (summary for 747 projected points, UTM zone 15, WGS84) 
Fig. 9 Results from Exercise 1. Map showing the spatial distribution of user's accuracy values. Legend classes: <= 0.61; 0.61 - 0.66; 0.66 - 0.72; 0.72 - 0.77; 0.77 - 0.82; > 0.82. Scale bar: 0 to 10 km 


R. Molinero-Parejo 


Fig. 10 Results from Exercise 1 displayed in the “output” window of the “Local accuracy assessment statistics” tool, showing variations in producer's accuracy (summary for 747 projected points, UTM zone 15, WGS84) 
Fig. 11 Results from Exercise 1. Map showing the spatial distribution of producer's accuracy values 
Abstract 


Global general Land Use and Land Cover (LUC) datasets 
map all land uses and covers across the globe, without 
focusing on any specific use or cover. This chapter only 
reviews those datasets available for one single date, which 
have not been updated over time. Seven different datasets 
are described in detail. Two others were identified, 
but are not included in this review because their 
coarseness limits their utility: Matthews Global 
Vegetation/Land Use and GMRCA LULC. The first 
experiences in global LUC mapping date back to the 
1990s, when leading research groups in the field produced 
the first global LUC maps at fine scales of 1 km spatial 
resolution: the UMD LC Classification and the Global 
Land Cover Characterization. Not long afterwards, in an 
attempt to build on these experiences and take them a 
stage further, an international partnership produced 
GLC2000 for the reference year 2000. These initial 
LUC mapping projects produced maps for just one 
reference year and were not continued or updated over 
time. Subsequent projects have mostly focused on the 
production of time series of global LUC maps, which 
allow us to study LUC change over time (see Chapter 
“Global General Land Use Cover Datasets with a Time 
Series of Maps”). As a result, there are relatively few 
single-date global LUC maps for recent years of 
reference. The latest projects and initiatives producing 
global LUC maps for single dates have focused on 
improving the accuracy of global LUC mapping and the 
use of crowdsourcing production strategies. The 
Geo-Wiki Hybrid and GLC-SHARE datasets built on 
the previous research in a bid to obtain more accurate 
global LUC maps by merging the data from existing 
datasets. OSM LULC is an ongoing test project that is 
trying to produce a global LUC map cheaply, using 
crowdsourced information provided by the OpenStreetMap 
community. The other dataset reviewed here is the 
LADA LUC Map, which was developed for a specific 
thematic project (Land Degradation Assessment in 
Drylands). This dataset is not comparable to the others 
reviewed in this chapter in terms of its purpose and 
nature, as is clear from its coarse spatial resolution (5 arc 
minutes). We therefore believe that this dataset should not 
be considered part of initiatives to produce more accurate, 
more detailed land use maps at a global level. 
1 UMD LC Classification—University of Maryland Land Cover Classification 


Product: LULC general 
Dates: 1992/93 (1 km); 1984 (8 km); 1987 (1°) 
Formats: Raster 
Pixel size: 1 km, 8 km, 1° 
Thematic resolution: 15 classes (1 km products): 1 (a), 1 (ag), 10 (v), 1 (m), 1 (na) 
Compatible legends: UMD, IGBP 
Extent: Global 
Updating: No 
Change detection: No (only one date) 
Overall accuracy: Expected to be ≥65% 
Website of reference: https://daac.ornl.gov/ISLSCP_II/guides/umd_landcover_xdeg.html 
Website language: English, Spanish, French, Arabic, Russian 
Download site: http://iridl.ldeo.columbia.edu/SOURCES/.UMD/.GLCF/.GLCDS/.lc/datafiles.html 
Availability: Open Access 
Format(s): .lan, .img 
Technical documentation: Hansen et al. (2000) 
Other references of interest: DeFries and Townshend (1994), DeFries et al. (1995), Hansen and Reed (2000), McCallum et al. (2006) 


1 (a): artificial; (ag): agriculture; (v): vegetation; (m): mixed classes; 
(na): no data. 


D. García-Álvarez et al. 


Global General Land Use Cover Datasets with a Single Date 
Project 


The Department of Geography of the University of Maryland 
hosted one of the first research groups to use the 
classification of satellite imagery for global LUC mapping. They 
initially produced an LUC map at a spatial resolution of 1 
degree for the year of reference 1987. This was followed 
some time later by the production of a finer map at 8 km for 
1984. Finally, the project delivered a map at 1 km, which at 
that time was the finest resolution at which global LUC 
mapping had ever been carried out. 

The Global Land Cover Facility that hosted all this data 
recently went offline. This means that there is currently no 
official website that supports the datasets and provides 
information about their particular specifications. The map at 
1 km can however be downloaded from external sites. The 
earlier maps at coarser resolutions are no longer available. 


Production method 


The UMD LC was obtained through supervised classification 
with a decision tree algorithm of imagery captured by the 
AVHRR sensor. Urban and built-up areas and water covers were 
not classified from the imagery. Instead, they were extracted 
from auxiliary sources. The classification obtained in this way 
was then improved in a post-classification stage by expert 
regional labelling, based on inconsistencies that were 
identified by the experts. 


Product description 


Users can download the UMD LC Classification in two 
formats (.lan, .img), which are available in the section 
“GIS-Compatible Formats”. The download is not straightforward 
and includes only the raster file with LUC information. 


Downloads 


LAN file 
— Raster file with LUC map 


Legend and codification 


Code  Label                         Code  Label 
0     Water                         8     Closed Shrubland 
1     Evergreen Needleleaf Forest   9     Open Shrubland 
2     Evergreen Broadleaf Forest    10    Grassland 
3     Deciduous Needleleaf Forest   11    Cropland 
4     Deciduous Broadleaf Forest    12    Bare Ground 
5     Mixed Forest                  13    Urban and Built-up 
6     Woodland                      14    Unclassified 
7     Wooded Grassland 
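Since the download contains only the raster file, users need to attach the legend themselves. A minimal sketch of how the codes above could be translated into labels and tallied is given below; the pixel values in the usage note are invented for illustration and the function name is hypothetical.

```python
from collections import Counter

# Codes and labels as listed in the UMD LC legend above.
UMD_LEGEND = {
    0: "Water", 1: "Evergreen Needleleaf Forest", 2: "Evergreen Broadleaf Forest",
    3: "Deciduous Needleleaf Forest", 4: "Deciduous Broadleaf Forest",
    5: "Mixed Forest", 6: "Woodland", 7: "Wooded Grassland",
    8: "Closed Shrubland", 9: "Open Shrubland", 10: "Grassland",
    11: "Cropland", 12: "Bare Ground", 13: "Urban and Built-up",
    14: "Unclassified",
}


def count_by_label(pixel_codes, legend=UMD_LEGEND):
    """Count pixels per legend label; unknown codes are reported as such."""
    return Counter(legend.get(code, f"Unknown code {code}") for code in pixel_codes)
```

For example, `count_by_label([0, 2, 2, 11])` reports one Water pixel, two Evergreen Broadleaf Forest pixels and one Cropland pixel.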


Practical considerations 


There is no official website hosting this dataset, which makes 
it more difficult to access and understand. Users must bear in 
mind that this was one of the first global LUC datasets ever 
developed and it can therefore be considered outdated in 
technical terms. 

Coarser versions of the 1 km map, resampled at 0.25, 0.5 
and 1 degree of spatial resolution, are also available.2 


2 https://daac.ornl.gov/cgi-bin/dsviewer.pl?ds_id=969. 
2 GLCC 2.0 Global—Global Land Cover Characterization 2.0 


Product: LULC general 
Dates: 1992/93 
Formats: Raster 
Pixel size: 1 km 
Thematic resolution: 100 classes (Global ecosystems legend); 19 classes (IGBP legend): 1 (a), 1 (ag), 10 (v), 2 (m), 2 (na) 
Compatible legends: Global Ecosystems, IGBP, USGS LULC system, SiB, SiB 2, BATS, Vegetation lifeforms 
Extent: Global 
Updating: No 
Change detection: No (only one date) 
Overall accuracy: Expected to be > 66% 
Website of reference: https://www.usgs.gov/centers/eros/science/usgs-eros-archive-land-cover-products-global-land-cover-characterization-glcc?qt-science_center_objects=0#qt-science_center_objects 
Website language: English 
Download site: https://earthexplorer.usgs.gov/ 
Availability: Open Access after registration 
Format(s): .tiff, .bil 
Technical documentation: Belward et al. (1999), Brown et al. (1999), Loveland and Belward (1997), Loveland et al. (2000), Reed et al. (2000) 
Other references of interest: Hansen and Reed (2000) 


Project 


The GLCC dataset was the result of collaboration between 
several international institutions: the U.S. Geological Survey 
(USGS), the Earth Resources Observation and Science 
(EROS) Center, the University of Nebraska-Lincoln 
(UNL) and the Joint Research Centre (JRC) of the European 
Commission. The project aimed to create a dataset of 
reference for global land monitoring. One of the LUC maps 
obtained from the project is usually referred to as the 
DISCover LUC map and follows the IGBP classification 
scheme. 

The global LUC map was created by joining various 
continental LUC maps together, and the final product consisted 
of a generalized global map and a set of more detailed 
continental maps. 

Two versions of the dataset have been produced so far, 
with the first being released in 1997. The second version 
(2.0) improved on the first by applying both the lessons 
learnt and user feedback. Version 1.2 of the product included 
the IGBP classification (DISCover LUC map). 


Production method 


The dataset was obtained through unsupervised classification 
(CLUSTER classifier) of AVHRR imagery at a spatial resolution 
of 1 km. The classification obtained was further 
refined with the help of auxiliary data from the Digital 
Elevation Model (DEM), Ecoregions data and other thematic 
maps specific for each region. Label-assignment for the 
spectral classes was based on expert interpretation. 

The dataset production was split into different continents, 
according to their specific characteristics. A detailed LUC 
map was produced for each continent and these were then 
joined together to create the global LUC product. 
Product description 


Two GLCC maps are available for download: the global 
product and the specific LUC product for each continent. 
The continental LUC maps show more detail than the global 
one and have specific legends that disaggregate the complexity 
of the land uses and covers for each continent. 

The data can be downloaded in two different formats (.bil, 
.tiff). The download for each format includes the LUC maps 
with all the various classification schemes, together with 
technical documentation about the product. The continental 
product also includes a specific binary raster which maps the 
built-up land cover. 

The product is distributed in two different projections: the 
Goode projection and a geographic projection. 


Downloads 


Global land cover product—Goode projection (“glecgbg20 tif”) 

— Raster files with LUC maps for each of the 7 classification schemes 
included in the product 
— PDF document with technical information about the product 


European land cover product—Goode projection (“gleceag20 tif”) 

— Raster files with LUC maps for each of the classification schemes 
included in the product 
— Raster file with urban land cover information (built-up/non built-up) 
— PDF document with technical information about the product 


Legend and codification 


LUC maps for each continent include a specific regional 
classification scheme, which is not shown here. The global 
dataset also supports seven different classification schemes. 
The most detailed of these is the Global Ecosystems (GLCC) 
scheme. In this case, however, we will only display the 
IGBP Land Cover classification scheme (IGBP), because it 
is the most commonly used of all the schemes provided by 
the dataset. 

Information about the codification and the meaning of all 
the other classification schemes can be found in the technical 
documentation included in the downloaded product, as well 
as in the documentation available on the project's website.3 


IGBP Land Cover (IGBP) Legend 


Code  Label                         Code  Label 
1     Evergreen Needleleaf Forest   11    Permanent Wetlands 
2     Evergreen Broadleaf Forest    12    Croplands 
3     Deciduous Needleleaf Forest   13    Urban and Built-Up 
4     Deciduous Broadleaf Forest    14    Cropland/Natural Vegetation Mosaic 
5     Mixed Forest                  15    Snow and Ice 
6     Closed Shrublands             16    Barren or Sparsely Vegetated 
7     Open Shrublands               17    Water Bodies 
8     Woody Savannas                99    Interrupted Areas (Goode's Homolosine Projection) 
9     Savannas                      100   Missing Data 
10    Grasslands 


3 https://www.usgs.gov/media/files/global-land-cover-characteristics-data-base-readme-version2. 


Practical considerations 


For more information about the product, users are referred to 
its readme file, which explains the project history, the 
dataset production workflow and all the characteristics of the 
product. 


4 https://prd-wret.s3.us-west-2.amazonaws.com/assets/palladium/production/s3fs-public/atoms/files/GlobalLandCoverCharacteristicsDataBaseReadme_Version2.pdf. 
3 GLC2000—Global Land Cover 2000 


Product: LULC general 
Dates: 2000 
Formats: Raster 
Pixel size: 1 km 
Thematic resolution: 23 classes: 1 (a), 1 (ag), 15 (v), 3 (m), 1 (na) 
Compatible legends: FAO LCCS, IGBP 
Extent: Global 
Updating: No 
Change detection: No (only one date) 
Overall accuracy: Expected to be >68% 
Website of reference: https://forobs.jrc.ec.europa.eu/products/glc2000/glc2000.php 
Website language: English 
Download site: https://forobs.jrc.ec.europa.eu/products/glc2000/products.php 
Availability: Open Access 
Format(s): .tiff, ESRI GRID, .img and Binary 
Technical documentation: Hua et al. (2018), McCallum et al. (2006), Neumann et al. (2007), Pérez-Hoyos et al. (2012), Tchuenté et al. (2011) 
Other references of interest: Bartholomé et al. (2002), Bartholomé and Belward (2005), Eva et al. (2004), Fritz et al. (2003) 
Project 


GLC2000 was a project run by the Joint Research Centre 
(JRC) of the European Commission in collaboration with 
regional teams across the globe. The objective of the project was 
to create a homogeneous, coherent global LUC map that was 
suitable for environmental monitoring. The reference year 2000 
was chosen because of its particular significance for that purpose. 

One of the most successful aspects of the project was the 
coordination of different teams across the globe to produce a 
global LUC map. To this end, GLC2000 provides a global 
dataset, together with a set of more detailed regional datasets 
adapted to the specificities of each territory. 


Production method 


GLC2000 was produced by different work teams across the 
globe. To this end, the world was split into 18 different 
regions, with each team mapping either a specific region or 
an area of special interest within a region. 

A LUC map for each region was obtained through 
unsupervised classification of imagery captured by the 
VEGETATION sensor. The classifications obtained were 
then labelled by each regional team according to their local 
expertise in the area. Input for the classification varied in line 
with the particular characteristics of each region. 

Regional LUC maps were merged into the global product, 
which is a coherent and homogeneous generalized mosaic of 
the set of regional maps. However, these regional maps 
provide more detail than the global one. 


Product description 


GLC2000 consists of two main products: the harmonized 
global LUC dataset covering the whole earth and the set of 
detailed regional LUC datasets. The Global LUC map can be 
downloaded in four different formats (ESRI, Binary, Tiff, 
Img), whereas the regional maps are only available in two 
(ESRI, Binary). The product for download includes a file to 
symbolize the raster LUC map as well as auxiliary information 
to interpret the legend. 


Downloads 


GLC2000 (Global) 


— Raster file with LUC map 
— Colormap file to symbolize the raster in ArcGIS (.clr) 
— Excel spreadsheet with the map legend 
GLC2000 (Europe) 


— Folder with raster file of the regional LUC map (glc_eu_v2) 
— Colormap file to symbolize the raster in ArcGIS (.clr) 
— DBF file with the map legend 


Legend and codification 


Code  Label 
1     Tree Cover, broadleaved, evergreen 
2     Tree Cover, broadleaved, deciduous, closed 
3     Tree Cover, broadleaved, deciduous, open 
4     Tree Cover, needle-leaved, evergreen 
5     Tree Cover, needle-leaved, deciduous 
6     Tree Cover, mixed leaf type 
7     Tree Cover, regularly flooded, fresh 
8     Tree Cover, regularly flooded, saline, (daily variation) 
9     Mosaic: Tree cover/Other natural vegetation 
10    Tree Cover, burnt 
11    Shrub Cover, closed-open, evergreen (with or without sparse tree layer) 
12    Shrub Cover, closed-open, deciduous (with or without sparse tree layer) 
13    Herbaceous Cover, closed-open 
14    Sparse Herbaceous or sparse shrub cover 
15    Regularly flooded shrub and/or herbaceous cover 
16    Cultivated and managed areas 
17    Mosaic: Cropland/Tree Cover/Other Natural Vegetation 
18    Mosaic: Cropland/Shrub and/or Herbaceous cover 
19    Bare Areas 
20    Water Bodies (natural and artificial) 
21    Snow and Ice (natural and artificial) 
22    Artificial surfaces and associated area 
23    No data 


Practical considerations 


Information about map metadata is easily available on the 
project's website together with technical documents 
describing the products. This information can help users gain 
a better understanding of the maps and all their specific 
characteristics, advantages and disadvantages. GLC2000 has 
also been widely analysed in the scientific literature. Users 
can find out more about the particular characteristics and the 
accuracy of the database by consulting some of the references 
of interest cited above. 
4 Geo-Wiki Hybrid 


Product: LULC general 
Dates: 2000/05 
Formats: Raster 
Pixel size: 300 m 
Thematic resolution: 10 classes: 1 (a), 1 (ag), 3 (v), 1 (m), 0 (na) 
Compatible legends: FAO LCCS 
Extent: Global 
Updating: Not planned 
Change detection: No (only one date) 
Overall accuracy: Expected to be > 82% (87.9% for Hybrid Map 1 and 82.8% for Hybrid Map 2) 
Website of reference: https://www.geo-wiki.org/ 
Website language: English 
Download site: https://application.geo-wiki.org/Application/index.php 
Availability: Open Access after registration 
Format(s): .img 
Technical documentation: See et al. (2015) 
Other references of interest: Fritz et al. (2012) 
Project 


This project aimed to merge available global LUC maps to 
create a new, more accurate dataset, in a bid to enable more 
accurate global LUC mapping. Reference LUC data collected 
by the Geo-Wiki platform via crowdsourcing was 
employed in the fusion process, thus pioneering a practice that 
has become more common in recent years. The dataset 
obtained in this way was one of the first and best-known 
examples of data fusion for global LUC mapping. 


Production method 


The hybrid map of the Geo-Wiki project was produced by 
merging three global LUC datasets: GLC2000, GlobCover 
and MODIS LC. Whereas GLC2000 shows the LUC state of 
the world for the reference year 2000, the other two sources 
provide LUC information for the reference year 2005. The 
spatial resolution of the hybrid map is the same as applied in 
the dataset with the highest resolution: GlobCover (300 m). 
The other two datasets, which had a spatial resolution of
1 km, were resampled to fit this resolution.
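
The source does not state which resampling method was used. As a minimal sketch, assuming nearest-neighbour resampling (a common choice for categorical rasters) and a grid measured in metres, the step could look as follows. The function name and the toy grid are illustrative only.

```python
def resample_nearest(grid, src_size, dst_size):
    """Resample a categorical grid to a finer cell size (nearest neighbour).

    Assumption: cell sizes are in the same linear unit (e.g. metres) and
    both grids share the same origin. Real global datasets are usually
    stored in geographic coordinates, which this sketch ignores.
    """
    rows, cols = len(grid), len(grid[0])
    out_rows = int(rows * src_size // dst_size)
    out_cols = int(cols * src_size // dst_size)
    out = []
    for r in range(out_rows):
        # Source row that contains the centre of the target cell
        src_r = min(rows - 1, int((r + 0.5) * dst_size // src_size))
        row = []
        for c in range(out_cols):
            src_c = min(cols - 1, int((c + 0.5) * dst_size // src_size))
            row.append(grid[src_r][src_c])
        out.append(row)
    return out


# A 1 x 2 grid of 1 km cells resampled to 300 m cells
coarse = [[1, 2]]
fine = resample_nearest(coarse, src_size=1000, dst_size=300)
print(fine[0])  # [1, 1, 1, 2, 2, 2]
```

Nearest-neighbour resampling never creates new class codes, which is why it is usually preferred over interpolation for LUC maps.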

For each dataset, a probability layer was produced 
indicating the probability of that source representing the 
correct LUC class on the ground. These layers were 
obtained by regressing the datasets with validation points 
created through Geo-Wiki campaigns. A Geographically 
Weighted Regression (GWR) algorithm was employed to 
this end. 

The probability layers were later merged in two different
ways, delivering two LUC maps. For Hybrid Map 1, each pixel
was assigned the LUC category from the dataset with the
highest value in the probability layers. For Hybrid Map 2,
when two LUC datasets agreed on a LUC category, that
category was selected. When the LUC datasets disagreed, the
LUC category from the dataset with the highest value in the
probability layers was chosen.
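
The two merging rules can be sketched for a single pixel as follows. This is an illustration under stated assumptions, not the producers' implementation: the function names and the sample values are hypothetical, and the three input classes are assumed to have already been translated to a common legend.

```python
from collections import Counter


def hybrid_map_1(classes, probabilities):
    """Rule 1: take the class from the dataset with the highest probability."""
    best = max(range(len(classes)), key=lambda i: probabilities[i])
    return classes[best]


def hybrid_map_2(classes, probabilities):
    """Rule 2: take the class on which at least two datasets agree;
    if all datasets disagree, fall back to rule 1."""
    label, votes = Counter(classes).most_common(1)[0]
    if votes >= 2:
        return label
    return hybrid_map_1(classes, probabilities)


# One pixel, three datasets (order: GLC2000, GlobCover, MODIS LC).
# Class codes and probabilities are invented for the example.
classes = [1, 4, 4]
probabilities = [0.9, 0.6, 0.5]
print(hybrid_map_1(classes, probabilities))  # 1
print(hybrid_map_2(classes, probabilities))  # 4
```

The example pixel shows how the two rules can diverge: the single most reliable source wins under rule 1, while the majority wins under rule 2.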

Product description 

Users can download the hybrid map in a compressed folder
(.rar), which contains the raster layer that stores the LUC
information. No auxiliary information is provided.


Downloads 


Geo-Wiki Hybrid (folder) 


— A raster file with LUC information (.img) 


Legend and codification 


Code | Label
1 | Tree cover
2 | Shrub cover
3 | Herbaceous vegetation/Grassland
4 | Cultivated and managed
5 | Mosaic of cultivated and managed/natural vegetation
6 | Flooded/wetland
7 | Urban
8 | Snow and ice
9 | Barren
10 | Open water


Practical considerations 


The Hybrid map is available online through the Geo-Wiki
platform (https://www.geo-wiki.org/). Although two hybrid maps
were produced, only one was finally distributed. No information
is provided as to which of these two maps is the one available
online and for download.

5 LADA LUC Map: Land Degradation Assessment in Drylands

Product: LULC general
Dates: 2007
Formats: Raster
Pixel size: 5 arc minutes
Thematic resolution: 40 classes: 1 (a), 7 (ag), 23 (v), 0 (m), 0 (na)
Compatible legends:
Extent: Global
Updating: No
Change detection: No (only one date)
Overall accuracy: Not specified
Website of reference: http://www.fao.org/land-water/land/land-governance/land-resources-planning-toolbox/category/details/en/c/1036360/
Website language: English
Availability: Open Access
Format(s): ESRI GRID, .tiff
Technical documentation: Nachtergaele and Petri (2013)
Other references of interest:

Project


Land Degradation Assessment in Drylands (LADA) is a
project led by the Food and Agriculture Organization
(FAO) of the United Nations that aims to assess and map
land degradation at different scales and levels, so as to
understand its impact on land use. One of the datasets
created in the project is a map of the world's Land Use
Systems (LUS). Many other datasets that may be of interest
to users were also created within the framework of this
project.


Production method 


The dataset was obtained after the interpretation of LUC 
units over a spatial dataset generated by the overlay of dif- 
ferent spatial thematic layers: the GLC2000 LUC map, 
cropland LUC maps, livestock distribution data, ecosystem 
and ecological indicators and socioeconomic factors such as 
population density. 


Product description 


The LADA LUC map can be downloaded in two different 
formats (ESRI GRID or TIF). In each case, users download 
the raster files containing the LUC information, together 
with a layer style file to symbolize the dataset in a GIS. 


Downloads 


ESRI GRID folder 


— Folder with raster files including LUC information (“lus”) 
— Folder with product metadata (“info”) 
— Layer style file for ArcGIS (lyr) 


TIF folder 


— Raster file with LUC map (.tiff) 
— Layer style file for ArcGIS (lyr) 


Legend and codification 


Code | Label
1 | Forest - Virgin
2 | Forest - Protected
3 | Forest - With agricultural activities
4 | Forest - With moderate or high livestock density
5 | Forest - Agroforestry
6 | Forest - Plantations
7 | Grasslands - Unmanaged
8 | Grasslands - Protected
9 | Grasslands - Low livestock density
10 | Grasslands - Moderate livestock density
11 | Grasslands - High livestock density
12 | Grasslands - Stable fed
13 | Shrubs - Unmanaged
14 | Shrubs - Protected
15 | Shrubs - Low livestock density
16 | Shrubs - Moderate livestock density
17 | Shrubs - High livestock density
18 | Shrubs - Stable fed
19 | Agricultural land - Rainfed crops (subsistence/commercial)
20 | Agricultural land - Crops and mod. intensive livestock density
21 | Agricultural land - Crops and intensive livestock density
22 | Agricultural land - Crops with large scale irrigation and mod. intensive or higher livestock density
23 | Agricultural land - Large-scale irrigation (>25% pixel size)
24 | Agricultural land - Protected
25 | Urban land
26 | Wetlands - Not used/not managed
27 | Wetlands - Protected
28 | Wetlands - Mangrove
29 | Wetlands - With agricultural activities
30 | Sparsely vegetated areas - Unmanaged
31 | Sparsely vegetated areas - Protected
32 | Sparsely vegetated areas - Low livestock density
33 | Sparsely vegetated areas - With mod. or higher livestock density
34 | Barren areas - Unmanaged
35 | Barren areas - Protected
36 | Barren areas - Low livestock density
37 | Barren areas - With mod. livestock density
38 | Open water - Unmanaged
39 | Open water - Protected
40 | Open water - Inland fisheries


Practical considerations 


The LADA LUC dataset is not a standard LUC map. It is a 
map of land use systems that was specifically created for the 
purposes of the LADA project, i.e. to study land degradation. 
6 GLC-SHARE: Global Land Cover-SHARE

Product: LULC general
Dates: Only one date, different for each part of the Earth
Formats: Raster
Pixel size: 1 km
Thematic resolution: 11 classes: 1 (a), 1 (ag), 6 (v), 0 (m), 0 (na)
Compatible legends: FAO LCCS
Extent: Global
Updating: None planned
Change detection: No (only one date)
Overall accuracy: Expected to be >80%
Website of reference: http://www.fao.org/land-water/land/land-governance/land-resources-planning-toolbox/category/details/en/c/1036355/
Website language: English
Download site: https://data.apps.fao.org/map/catalog/srv/eng/catalog.search?uuid=ba4526fd-cdbf-4028-a1bd-5a559c4bff38&currTab=distribution#/metadata/ba4526fd-cdbf-4028-a1bd-5a559c4bff38
Availability: Open Access
Format(s): .tiff, .kml, WMS
Technical documentation: Latham et al. (2014)
Other references of interest:
Project


GLC-SHARE was a project led by the Land and Water 
Division of the Food and Agriculture Organization (FAO), 
in collaboration with other institutions across the world. It 
aimed to create a global LUC map by mixing different 
sources of LUC information available at detailed scales. The 
objective was to improve the accuracy and quality of LUC 
information, so as to have a reliable source of global LUC 
information for policymaking. 

Unlike other global LUC mapping projects, 
GLC-SHARE provides detailed LUC information in a single 
global product. Usually, this is only available in national, 
regional and local datasets. 

Although the GLC-SHARE was produced in 2014, it was 
conceived as a living database that could integrate new LUC 
datasets as they were released or updated. Its production 
method has been made public, so enabling product replication. 

As GLC-SHARE was produced by merging data from 
multiple databases, it has no specific date of reference. There 
are different dates for each part of the world, according to the 
main product that was used to map them. 


Production method 


GLC-SHARE was produced by merging and integrating 
high-quality LUC data for different areas of the world. LUC 
data at all scales (global, national, sub-national, regional) 
was used to produce the map. 

In order to merge the various LUC datasets into a single 
product, their legends had to be harmonized. When different 
products were available for the same area, the one with the 
most detailed, most accurate data was chosen. If no products 
were available at detailed or national scales, global LUC 
datasets (Globcover 2009, MODIS VCF 2010 and Cropland 
database 2012) were used instead. The main areas not cov- 
ered by high-resolution datasets included Latin America, 
West Africa, Indonesia and important parts of Asia, such as 
Thailand and the Arabian Peninsula. 

An initial map was obtained for each of the 11 land cover
classes that make up the GLC-SHARE classification legend.
Each map shows the proportion of each pixel of the
GLC-SHARE grid that is occupied by the land cover class.
Finally, from the 11 thematic rasters created, a general raster
was obtained indicating the dominant land cover type in
each pixel.
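
The last step, deriving the dominant land cover from the per-class proportion rasters, can be sketched as follows. This is a minimal illustration, assuming each proportion layer is held as a small 2D list and that ties are resolved in favour of the lowest class code; the tie rule and the function name are assumptions, not taken from the GLC-SHARE documentation.

```python
def dominant_map(layers):
    """Return a grid with the code of the class that has the largest
    proportion in each pixel.

    layers: dict mapping class code -> 2D list of proportions (0-100).
    Assumption: ties go to the lowest class code, because max() keeps
    the first maximum it finds in the sorted code list.
    """
    codes = sorted(layers)
    first = layers[codes[0]]
    rows, cols = len(first), len(first[0])
    return [
        [max(codes, key=lambda code: layers[code][r][c]) for c in range(cols)]
        for r in range(rows)
    ]


# Toy proportion layers for three of the 11 classes (values invented):
# 1 = Artificial Surfaces, 2 = Cropland, 4 = Tree covered areas
layers = {
    1: [[100, 10]],
    2: [[0, 60]],
    4: [[0, 30]],
}
print(dominant_map(layers))  # [[1, 2]]
```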


Product description 


GLC-SHARE products can be downloaded in raster format 
or as a kml file to upload in Google Earth or any other GIS 
software. GLC-SHARE maps are also available through a 
WMS web service. 

Users can download the global GLC-SHARE LUC map,
which indicates the dominant land cover type in each pixel,
or individual LUC rasters showing the proportion of each
LUC type in each pixel. In these rasters, the pixel value
refers to the proportion (0-100) of the pixel that the category
occupies. For example, a pixel covered exclusively by
artificial surfaces would have a value of 100 in the
GLC-Share “Artificial surfaces” raster.

Users can also download auxiliary information about the 
dataset from the website. This includes a technical report 
about the product (GLC-Share report) as well as a raster and
an Excel spreadsheet explaining which dataset was used to
map each area of the world (GLC-Share—Sources). 


Downloads 


GLC-Share—Dominant land cover type 


— Raster file with LUC map displaying the dominant land cover type 
— Layer style file for ArcGIS (lyr) 
— Text document showing the classification legend for the dataset 


GLC-Share—Sources 


— Raster file with information about which LUC dataset was used to 
map each area of the world 

— Layer style file for ArcGIS (lyr) 

— Excel spreadsheet with information about which LUC dataset was 
used to map each area of the world 

— Text document explaining the downloaded product 


GLC-Share—Artificial surfaces 


— Raster file with information about the proportion of artificial 
surfaces in each pixel 
Legend and codification


Code | Label
1 | Artificial Surfaces
2 | Cropland
3 | Grassland
4 | Tree covered areas
5 | Shrubs covered areas
6 | Herbaceous vegetation, aquatic or regularly flooded
7 | Mangroves
8 | Sparse vegetation
9 | Bare soil
10 | Snow and glaciers
11 | Water bodies


Practical considerations 


GLC-SHARE is a single product with no information about 
changes in LUC over time. It was created in 2014, which 
may therefore be considered the reference year for the 
dataset. However, this date may vary a great deal between 
the different parts of the world. GLC-SHARE is therefore 
not recommended for studies or analyses of LUC change. 

Although the dataset was conceived as a live map, it has 
not been further updated with the inclusion of new LULC 
datasets since 2014. 
7 OSM Landuse/Landcover

Product: LULC general
Dates: Only one date, which cannot be specified
Formats: Raster
Pixel size: 10 m
Thematic resolution: 14 classes: 4 (a), 3 (ag), 2 (v), 3 (m), 1 (na)
Compatible legends: CLC
Extent: Global (with gaps) / Europe (full coverage)
Updating: Completion of the map is ongoing, although new editions of the map for different years of reference are not expected
Change detection: No (only one date)
Overall accuracy: Not specified
Website of reference: https://data.osmlanduse.org
Website language: English
Download site: https://data.osmlanduse.org
Availability: On request (email to producers)
Format(s): .tiff
Technical documentation: Schultz et al. (2017)
Other references of interest: Fonte and Martinho (2017), Fonte et al. (2017a, b), Viana et al. (2019)
Project


OSM Landuse/Landcover (LULC) is a LUC dataset created
as part of the H2020 project “LandSense”, which aims to
engage citizens in the production of LUC information.
OSM LULC has been developed mainly by the GIScience
research group at Heidelberg University.

OSM LULC is an attempt to exploit the LUC information
contained in the OpenStreetMap (OSM) database. It is a test
project and therefore cannot be regarded as a final product with
full global coverage. Nevertheless, the project has developed a
workflow to obtain LUC information from the OSM database
as well as a methodology for obtaining a LUC map with full
coverage over a specific test area (Heidelberg), filling the gaps
in OSM via classification of satellite imagery.


Production method 


OSM LULC was produced using a very simple method.
The authors downloaded the OSM database and translated the
tags that define the features stored in the database into LUC
terms, using the legend of the Corine Land Cover (CLC) survey
as a reference. To this end, an equivalence table between the
OSM tags and the CLC level 2 legend was created.
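
An equivalence table of this kind can be represented as a simple lookup. The sketch below is illustrative only: the tag-to-class pairs are hypothetical examples chosen for readability, not the table published by the map producers, and the class codes follow the legend listed for this dataset later in this section.

```python
# Hypothetical excerpt of an OSM tag -> CLC level 2 equivalence table.
# Keys are (OSM key, OSM value) pairs; values are CLC level 2 codes.
TAG_TO_CLC = {
    ("landuse", "residential"): 11,  # Urban fabric
    ("landuse", "industrial"): 12,   # Industrial, commercial and transport units
    ("landuse", "quarry"): 13,       # Mine, dump and construction sites
    ("leisure", "park"): 14,         # Artificial, non-agricultural vegetated areas
    ("landuse", "farmland"): 21,     # Arable land
    ("landuse", "vineyard"): 22,     # Permanent crops
    ("landuse", "meadow"): 23,       # Pastures
    ("landuse", "forest"): 31,       # Forests
}


def clc_code(tags):
    """Return the CLC level 2 code for an OSM feature, or None if none
    of its tags appears in the equivalence table."""
    for key, value in tags.items():
        code = TAG_TO_CLC.get((key, value))
        if code is not None:
            return code
    return None


print(clc_code({"landuse": "forest", "name": "Stadtwald"}))  # 31
print(clc_code({"highway": "path"}))                         # None
```

Features whose tags have no equivalent remain unclassified, which is one of the reasons for the gaps in the resulting map.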

The OSM LUC information, which is stored in vector format,
was generalized to a grid with a pixel size of 30 m. In the
event of feature overlap when aggregating information,
preference was given to the smaller features.
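
The overlap rule can be sketched by painting features onto the grid from largest to smallest, so that smaller features overwrite larger ones. This is a simplified illustration: features are assumed to be axis-aligned rectangles in metres, a cell is assigned by its centre point, and the function name is hypothetical. Real OSM polygons would require a GIS library.

```python
def rasterize(features, width, height, cell=30.0, nodata=None):
    """Rasterize rectangular features, giving preference to smaller ones.

    features: list of (xmin, ymin, xmax, ymax, code) tuples in metres.
    Assumption: row 0 starts at y = 0 and a cell belongs to a feature
    when its centre falls inside the rectangle.
    """
    grid = [[nodata] * width for _ in range(height)]

    def area(feature):
        xmin, ymin, xmax, ymax, _ = feature
        return (xmax - xmin) * (ymax - ymin)

    # Largest first, so that smaller features are painted last and win
    for xmin, ymin, xmax, ymax, code in sorted(features, key=area, reverse=True):
        for row in range(height):
            cy = (row + 0.5) * cell
            for col in range(width):
                cx = (col + 0.5) * cell
                if xmin <= cx < xmax and ymin <= cy < ymax:
                    grid[row][col] = code
    return grid


# A forest (code 31) containing a small park (code 14)
features = [(0, 0, 90, 60, 31), (30, 0, 60, 30, 14)]
for row in rasterize(features, width=3, height=2):
    print(row)
# [31, 14, 31]
# [31, 31, 31]
```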

Gap areas not covered by the OSM database were filled with 
the LUC information obtained by a supervised classification of 
Landsat imagery with the random forest classifier. This process 
was only carried out for a European test area, leaving important 
information gaps in the rest of the global map. 
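
The gap-filling step amounts to keeping the OSM-derived class wherever one exists and using the classified imagery elsewhere. A minimal sketch, assuming both layers are already aligned on the same grid and that missing OSM cells are marked with a no-data value (the function name and sample values are illustrative):

```python
def fill_gaps(osm_grid, classified_grid, nodata=None):
    """Keep OSM-derived codes; use the classified code only where the
    OSM grid has no data."""
    return [
        [classified if osm == nodata else osm
         for osm, classified in zip(osm_row, classified_row)]
        for osm_row, classified_row in zip(osm_grid, classified_grid)
    ]


osm = [[31, None], [None, 11]]   # None marks cells without OSM coverage
landsat = [[32, 21], [23, 12]]   # result of the image classification
print(fill_gaps(osm, landsat))   # [[31, 21], [23, 11]]
```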

Due to the particular characteristics of the OSM database, 
LUC information is not provided for a single date. Each fea- 
ture of the database has a different date. This makes it difficult 
to determine the date of reference for each pixel in the dataset. 


Product description 
The product was initially distributed in tiles. However, users
can also request a specific file for their area of interest by
email. These files contain the LUC map and an Excel
spreadsheet with the pixel count for each category. They do
not include the qualitative meaning of the category codes.


Downloads 
OSM Landuse 


— Raster file with LUC map (.tiff) 
— Excel file with class codes and pixel count per class 


Legend and codification 


Code | Label
5 | Water bodies
11 | Urban fabric
12 | Industrial, commercial and transport units
13 | Mine, dump and construction sites
14 | Artificial, non-agricultural vegetated areas
21 | Arable land
22 | Permanent crops
23 | Pastures
31 | Forests
32 | Shrub and/or herbaceous vegetation associations
33 | Open spaces with little or no vegetation
41 | Inland wetlands
42 | Coastal wetlands
NA | No data


Practical considerations 


The website for this database includes a form for those who
want to download the map. However, interested users are
advised to contact the map producers directly, as the form
does not always work. Contact details for the map producers
are available on the project's website (https://osmlanduse.org/).

Users should be aware of the limitations of this dataset.
As there is no single reference year for all the mapped areas,
it may be difficult to use the map as a reference when
analysing changes over time.

Abstract


General Land Use Cover (LUC) datasets provide a
holistic picture of all the land uses and covers on Earth,
without focusing specifically on any individual land use
category. As opposed to the LUC maps reviewed in Chap.
“Global General Land Use Cover Datasets with a Single
Date”, which are only available for one date or year,
the maps with time series allow users to study LUC
change over time. Time series of general LUC datasets at
a global scale are useful for understanding global patterns
of LUC change and their relation with global processes
such as climate change or the loss of biodiversity.
MCD12Q1, also known as MODIS Land Cover, was the 
first time series of LUC maps to be produced on a global 
scale. When it was first launched in 2002, there were 
already many organizations and researchers working on 
accurate, detailed global LUC maps, although these were 
all one-off editions for single years. The MCD12Q1 
dataset continues to be updated today, providing a series 
of maps for the period 2001-2018. Since the launch of 
MCD12Q1, many other historical series of LUC maps 
have been produced, especially in the last decade. This 
has resulted in the LUC map series covering a longer time 
period at higher spatial resolution. Recent efforts have 
focused on producing consistent time series of maps that 
can track LUC changes over time with low levels of 
uncertainty. GLCNMO (500 m), GlobCover (300 m) and 
GLC250 (250 m) provide time series of LUC maps at 
similar spatial resolutions to MCD12Q1 (500 m), 
although for fewer reference years. GLCNMO provides
information for the years 2003, 2008 and 2013,
GlobCover for 2005 and 2009 and GLC250 for 2001
and 2010. GLASS-GLC is the dataset with the coarsest
spatial resolution of all those reviewed in this chapter 
(5 km), even though it was released very recently, in 
2020. Map producers have focused on this dataset’s long 
timespan (1982-2015) rather than on its spatial detail. 
LC-CCI and CGLS-LC100 are the recently launched 
datasets providing a consistent series of LUC maps, 
which show LUC changes over time with lower levels of 
uncertainty. LC-CCI provides LUC information for one 
of the longest timespans reviewed here (1992-2018) at a 
spatial resolution of 300 m. CGLS-LC100 provides LUC 
information for a shorter period (2015-2019) but at a 
higher spatial resolution (100 m). In both cases, updates 
are scheduled. The datasets with the highest levels of 
spatial detail are FROM-GLC and GLC30. These were 
produced using highly detailed Landsat imagery, deliv- 
ering time series of maps at 30 m. The FROM-GLC 
project even has a test LUC map at a spatial resolution of 
10 m from Sentinel-2 imagery for the year 2017, making 
it the global dataset with the greatest spatial detail of all 
those reviewed in this book. Both FROM-GLC and 
GLC30 provide data for three different dates: the former 
for 2010, 2015 and 2017 and the latter for 2000, 2010 and 
2020. 
1 GLASS-GLC: Global Land Surface Satellite-Global Land Cover

Product: LULC general
Dates: 1982-2015
Formats: Raster
Pixel size: 5 km
Thematic resolution: 8 classes: 0 (a), 1 (ag), 4 (v), 0 (m), 1 (na)
Compatible legends: FROM-GLC
Extent: Global
Updating: Not planned
Change detection: Possible, although sources of uncertainty may arise
Overall accuracy: Expected to be >82%
Website of reference: http://data.ess.tsinghua.edu.cn/
Website language: English
Download site: https://doi.pangaea.de/10.1594/PANGAEA.913496
Availability: Open Access
Format(s): .tiff
Technical documentation: Liu et al. (2020)
Other references of interest:

Note: (a): artificial; (ag): agriculture; (v): vegetation; (m): mixed classes;
(na): no data.



Global General Land Use Cover Datasets with a Time Series of Maps 
Project 


GLASS-GLC is the result of research on LUC mapping
carried out by a group of Chinese researchers. It is
part of the efforts led by Tsinghua University to map
global LUC information, which also include the
FROM-GLC project, reviewed later in this chapter.

The project has delivered a series of global LUC maps at 
coarse resolution (5 km). This spatial resolution may limit 
the applicability of the dataset as, for example, it does not 
include information on impervious areas. 


Production method 


GLASS-GLC was obtained after making a supervised classifi- 
cation of AVHRR satellite imagery with the Google Earth 
Engine cloud platform. Random forest was the selected classi- 
fier. Auxiliary data, such as a Vegetation Continuous Field layer 
or a Digital Elevation Model, were also used in the classification. 

To ensure the consistency of the maps over time, the 
authors applied the "LandTrendr" method and a linear 
regression-based algorithm. These helped to detect the LUC 
changes in the imagery archive used to obtain the LUC maps. 
Product description

GLASS-GLC can be downloaded as a single compressed
file. This file includes the LUC maps for every year in the
map series (1982-2015), as well as auxiliary data to help
users understand the product.

Downloads

GLASS-GLC


— A raster file with the LUC information for each available year (.tiff) 
— Word document with a technical description of the product 


Legend and codification 


Code | Label
10 | Cropland
20 | Forest
30 | Grass
40 | Shrubland
70 | Tundra
90 | Barren land
100 | Snow/ice
0 | No data
2 LC-CCI: Land Cover-Climate Change Initiative

Product: LULC general
Dates: 1992-2018
Formats: Raster
Pixel size: 300 m (150 m for water bodies and 500 m for snow condition)
MMU: 1 km for changes
Thematic resolution: 37 classes: 1 (a), 2 (ag), 26 (v), 4 (m), 1 (na)
Compatible legends: PFT, FAO LCCS
Extent: Global
Updating: Update planned (no date)
Change detection: Yes
Overall accuracy: Expected to be >70%
Website of reference: https://www.esa-landcover-cci.org/
Website language: English
Download site: http://maps.elie.ucl.ac.be/CCI/viewer/download.php
Availability: Open Access after provision of name, institution and email
Format(s): .tiff, .nc (NetCDF4)
Technical documentation: ESA (2017)
Other references of interest: Bontemps et al. (2012), Hollmann et al. (2013), Hua et al. (2018), Mousivand and Arsanjani (2019), Vilar et al. (2019)


Project 


The Land Cover-Climate Change Initiative is a project run 
by the European Space Agency (ESA) that seeks to create 
LUC products that meet the requirements of the Global 
Climate Observing System (GCOS) for Essential Climate 
Variables (ECV) and the Climate Modelling Community 
(CMC). It builds on the lessons learned during the Glob- 
Cover project. It also takes into account the opinion and the 
needs of users working in the climate and global land cover 
research communities, who were consulted and engaged 
with during the project. 

The purpose of the project is to deliver a time series of
land cover data that is stable, dynamic, transparent and
flexible. This means, first, obtaining a historical series of
land cover maps that show the changes over time, with no
technical errors or instability; second, producing a LUC
dataset with a wide range of applications; and third,
providing all relevant information regarding the quality of
the dataset.

The project was launched in 2009 and has been developed
in different phases. The initial idea was to create a LUC
product covering three time periods (1998-2002, 2003-2007
and 2008-2012). Later, an improved yearly LUC product for
the period 1992-2015 was launched, which replaced the
previous one. Recently, this latter product has been updated
and now includes new LUC maps for the period 2016-2018,
which are consistent with the previous series.

Apart from LUC maps, other interesting products have
also been created as part of the Climate Change Initiative:
weekly image composites of the AVHRR (1992-1999,
1 km), MERIS (2003-2012, 300 m and 1 km) and
PROBA-V (2014-2015, 1 km) sensors; a static map of open
water bodies; and three global land surface seasonality
products characterizing the dynamics of vegetation
greenness, snow and burnt areas.


Production method 


The LC-CCI LUC map series is based on a single base LUC
map that is progressively updated and backdated. The base
LUC layer was created by classifying a series of MERIS
image composites for the period 2003-2012. A different
classification was carried out for each year of this period,
and the map finally obtained was a combination of all these
classifications. This allowed the producers to differentiate
between land cover states (i.e. those land features that remain
stable over time) and land cover seasonality (i.e. natural,
seasonal variability of land cover features that does not imply
a change in the cover itself).
The classification method combined the GlobCover
unsupervised classification chain with a machine learning
algorithm. During the classification process, a series of
spectrotemporal classes were identified. These were later
labelled as LUC classes with the help of experts. The
classification was regionalized to account for regional diversity
and local heterogeneity of land cover characteristics.

Change detection for updating and backdating the base
map was carried out with imagery from different sensors
(AVHRR, SPOT, MERIS and PROBA-V), according to image
availability. Changes were detected at a spatial resolution of
1 km, and since 2013 have been delineated at 300 m.
Previously, delineation of changes at finer spatial resolutions
had been impossible due to the lack of available images.

As a general rule, the only changes studied were those 
between six wide categories, which are not semantically 
close to each other: agriculture, forest, grassland, wetland, 
settlement and others. These changes had to persist for at 
least two years to be considered. The purpose of these rules 
was to try to ensure the stability over time of the LUC map 
series, avoiding technical changes and noise. 
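
The persistence rule can be illustrated for the annual series of a single pixel. The sketch below is an interpretation of the rule described above, not the project's processing chain: it works directly on the six broad categories, accepts a change only if the new category is observed for at least two consecutive years, and therefore discards a change that appears only in the final year. The function name is hypothetical.

```python
def stabilise(series, min_years=2):
    """Suppress changes that do not persist for at least min_years.

    series: list of broad category labels, one per year, for one pixel.
    Assumption: a change seen only in the last year(s) of the series,
    for fewer than min_years, is treated as noise.
    """
    if not series:
        return []
    current = series[0]
    result = [current]
    for i in range(1, len(series)):
        label = series[i]
        if label != current:
            # Count how many consecutive years the new label lasts
            run = 1
            while i + run < len(series) and series[i + run] == label:
                run += 1
            if run >= min_years:
                current = label
        result.append(current)
    return result


observed = ["forest", "forest", "agriculture", "forest",
            "agriculture", "agriculture", "agriculture"]
print(stabilise(observed))
# ['forest', 'forest', 'forest', 'forest',
#  'agriculture', 'agriculture', 'agriculture']
```

In the example, the one-year switch to agriculture in the third year is discarded, whereas the later change, which lasts three years, is kept.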


Product description 


The LC-CCI dataset is distributed in different ways. This gives 
users the flexibility to download the product that best suits their 
needs. A single LUC map in either GeoTIFF or NetCDF4 may 
be downloaded for each year of the period 1992-2015. For the 
most recent years (2016-2018), these are only available in 
NetCDF4 format. Additionally, the whole time series of maps 
for the period 1992-2015 can be downloaded as a single raster 
with multiple bands, in either of the two formats available. 
When downloading the LUC maps, users only gain 
access to the rasters with LUC information. However, other 
supplementary information is available on the project's 
website. This includes a CSV file with the legend descrip- 
tion; layer style files for displaying the rasters in common 
GIS software (ArcGIS, ENVI and QGIS); GeoTIFF files 
with information about the quality and uncertainty of the 
LUC maps time series (Quality flags); and a data package for 
users working with the Sen2Cor classification software. 


Downloads 


LC Map 2015 
— Raster file with LUC map 


LC maps full 1992-2015 series 


— Raster file with LUC maps series 
Legend and codification


Code | Label
0 | No data
10 | Cropland, rainfed
11 | Herbaceous cover
12 | Tree or shrub cover
20 | Cropland, irrigated or post-flooding
30 | Mosaic cropland (>50%)/natural vegetation (tree, shrub, herbaceous cover) (<50%)
40 | Mosaic natural vegetation (tree, shrub, herbaceous cover) (>50%)/cropland (<50%)
50 | Tree cover, broadleaved, evergreen, closed to open (>15%)
60 | Tree cover, broadleaved, deciduous, closed to open (>15%)
61 | Tree cover, broadleaved, deciduous, closed (>40%)
62 | Tree cover, broadleaved, deciduous, open (15-40%)
70 | Tree cover, needleleaved, evergreen, closed to open (>15%)
71 | Tree cover, needleleaved, evergreen, closed (>40%)
72 | Tree cover, needleleaved, evergreen, open (15-40%)
80 | Tree cover, needleleaved, deciduous, closed to open (>15%)
81 | Tree cover, needleleaved, deciduous, closed (>40%)
82 | Tree cover, needleleaved, deciduous, open (15-40%)
90 | Tree cover, mixed leaf type (broadleaved and needleleaved)
100 | Mosaic tree and shrub (>50%)/herbaceous cover (<50%)
110 | Mosaic herbaceous cover (>50%)/tree and shrub (<50%)
120 | Shrubland
121 | Evergreen shrubland
122 | Deciduous shrubland
130 | Grassland
140 | Lichens and mosses
150 | Sparse vegetation (tree, shrub, herbaceous cover) (<15%)
152 | Sparse shrub (<15%)
153 | Sparse herbaceous cover (<15%)
160 | Tree cover, flooded, fresh or brackish water
170 | Tree cover, flooded, saline water
180 | Shrub or herbaceous cover, flooded, fresh/saline/brackish water
190 | Urban areas
200 | Bare areas
201 | Consolidated bare areas
202 | Unconsolidated bare areas
210 | Water bodies
220 | Permanent snow and ice

Practical considerations 


The project is aimed at the climate change research
community and therefore provides the LUC data in the NetCDF4
raster file format commonly used by this community.
However, .nc files are much larger than .tiff files.

LUC maps for single years are easily displayed in QGIS.
However, raster files storing the whole series of LUC maps
for the period 1992-2015 are very large and are difficult to
display in QGIS without a powerful computer.




3 GLC30: GlobeLand30

Product: LULC general
Dates: 2000, 2010, 2020
Formats: Raster
Pixel size: 30 m
MMU: Variable, depending on the category (3 x 3 to 10 x 10 pixels)
Thematic resolution: 10 classes: 1 (a), 1 (ag), 4 (v), 0 (m), 0 (na)
Compatible legends: GLC30
Extent: Global
Updating: Not planned
Change detection: Yes
Overall accuracy: Expected to be >78%
Website of reference: http://www.globallandcover.com/home_en.html
Website language: English
Download site: http://www.globallandcover.com/defaults_en.html?src=/Scripts/map/defaults/En/download_en.html&head=download&type=data
Availability: Open access after registration
Format(s): .tiff
Technical documentation: Chen et al. (2010, 2011a, b, 2012, 2014, 2016), Tang et al. (2014), Xie et al. (2015), Zhu et al. (2010)
Other references of interest: Cao et al. (2014), Chen et al. (2013, 2017), Han et al. (2015), Jun et al. (2014), Manakos et al. (2018), Shi et al. (2016a, b), Wu et al. (2016), Yang et al. (2017)
Project


GlobeLand30 (GLC30) is a project funded and promoted by 
the Chinese government and the National Science Founda- 
tion of China. It aims to coherently map the land uses and 
covers on the world's surface at a detailed scale, using 
images from the Landsat satellite imagery archive. 

The project initially focused on analysing the best 
methods and procedures to carry out such an ambitious task. 
It then produced a global LUC map at 30 m for the reference 
years 2000 and 2010. An update of the dataset for the year 
2020 was recently released, in which Antarctica was mapped 
for the first time. 


Production method 


GLC30 was obtained after classifying Landsat imagery 
using a pixel-object-knowledge-based (POK-based) classi- 
fication approach. Other sources of complementary imagery 
were also used for the reference years 2010 (HJ-1—China 
Environment and Disaster Reduction Satellite) and 2020 
(GF-1—China High Resolution Satellite). 

The classification was carried out independently for each
of the mapped categories. Water bodies were mapped first,
followed by wetlands, snow and ice, artificial surfaces,
cultivated land, forest, shrubland, grassland, bare land and
finally tundra. Once a LUC category had been classified, the
pixels assigned to that category were masked out of the
subsequent classifications.
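
The masking sequence described above can be sketched as follows. This is only a minimal illustration, not the GLC30 production code: the detector functions, the index names (`ndvi`, `ndwi`), the thresholds and the function name `classify_sequentially` are all hypothetical. Only the class order and the idea of masking pixels that have already been assigned come from the text.

```python
from typing import Callable, Dict, List, Optional

# Class order taken from the text: water first, tundra last.
CLASS_ORDER = [
    "water bodies", "wetland", "permanent snow and ice", "artificial surfaces",
    "cultivated land", "forest", "shrubland", "grassland", "bareland", "tundra",
]

Pixel = Dict[str, float]           # hypothetical per-pixel features
Detector = Callable[[Pixel], bool]


def classify_sequentially(
    pixels: List[Pixel],
    detectors: Dict[str, Detector],
    order: List[str] = CLASS_ORDER,
) -> List[Optional[str]]:
    """Assign each pixel to the first class, in `order`, whose detector accepts it."""
    labels: List[Optional[str]] = [None] * len(pixels)
    for name in order:
        detect = detectors.get(name)
        if detect is None:
            continue  # no detector supplied for this class in the toy example
        for i, pixel in enumerate(pixels):
            # Pixels already labelled by an earlier class are masked out.
            if labels[i] is None and detect(pixel):
                labels[i] = name
    return labels


# Toy detectors based on two made-up indices (assumptions, not GLC30 rules).
toy_detectors = {
    "water bodies": lambda p: p["ndwi"] > 0.3,
    "forest": lambda p: p["ndvi"] > 0.6,
    "grassland": lambda p: p["ndvi"] > 0.2,
}

toy_pixels = [
    {"ndvi": 0.7, "ndwi": 0.5},  # water and forest rules both fire: water wins
    {"ndvi": 0.7, "ndwi": 0.0},
    {"ndvi": 0.3, "ndwi": 0.0},
    {"ndvi": 0.0, "ndwi": 0.0},  # matches nothing and stays unlabelled
]

toy_labels = classify_sequentially(toy_pixels, toy_detectors)
```

Because each class is only tested on pixels that are still unlabelled, the order of the classes decides which label wins when several rules would accept the same pixel.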

Each category was classified according to a specific 
approach, adapted to the characteristics of the features being 
mapped. For most of the categories, the classification 
approach consisted of three main steps: a pixel-based clas- 
sifier, image segmentation and knowledge-based verifica- 
tion. For this last step, different sources of auxiliary 
information were used via their integration in a web-based 
data platform. 


Product description 


GLC30 is distributed in tiles. Users can separately download
a LUC map for each tile and year of reference. The
download includes the LUC map in raster format, a metadata
file and a vector file with information about the satellite
imagery used to obtain the map.


Downloads 


GLC30 2020 


— Raster file with LUC map (.tiff) 

— Shapefile with information about the imagery source used in the
LUC classification (.shp)

— Metadata file (.xls) 


Legend and codification 


Code | Label | Code | Label
10 | Cultivated land | 60 | Water bodies
20 | Forest | 70 | Tundra
30 | Grassland | 80 | Artificial surfaces
40 | Shrubland | 90 | Bareland
50 | Wetland | 100 | Permanent snow and ice
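
As an illustration of how the codes above can be used, the following sketch translates pixel values into class labels and sums the area per class. The function name `class_areas_ha` and the sample values are hypothetical, and the area calculation assumes square 30 m pixels on a projected grid.

```python
from collections import Counter

# Codes and labels copied from the GLC30 legend above.
GLC30_LEGEND = {
    10: "Cultivated land", 20: "Forest", 30: "Grassland", 40: "Shrubland",
    50: "Wetland", 60: "Water bodies", 70: "Tundra", 80: "Artificial surfaces",
    90: "Bareland", 100: "Permanent snow and ice",
}


def class_areas_ha(codes, pixel_size_m=30.0, legend=GLC30_LEGEND):
    """Return {label: area in hectares} for an iterable of pixel codes."""
    pixel_area_ha = pixel_size_m * pixel_size_m / 10_000.0
    counts = Counter(codes)
    return {
        legend.get(code, f"Unknown code {code}"): n * pixel_area_ha
        for code, n in counts.items()
    }


# Hypothetical 3 x 3 window of pixel values, flattened row by row.
sample_codes = [10, 10, 20, 20, 20, 60, 80, 80, 10]
areas = class_areas_ha(sample_codes)
```

Codes that are missing from the legend are reported under an "Unknown code" label rather than silently dropped, which helps to spot nodata values.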


Practical considerations 


The GLC30 LUC maps for 2000, 2010 and 2020 can also be
accessed online through the project website,² which also
includes plenty of information about the project and various 
other datasets. These include the 2020 imagery used to map 
the latest update of the dataset and different sources of ref- 
erence data used as auxiliary information in the mapping 
process. 

No technical documents describe the latest update of the
map, for the year 2020. Any methodological changes
introduced in its production could lead to errors when it is
compared with previous editions.

The project website is not consistently maintained and has
been left unattended for months at a time in recent years. If
this continues, the dataset may not be accessible in the
future.


2 http://www.globallandcover.com/.




4 GLC250—Global Land Cover 250 m 


Website of reference 
http://data.ess.tsinghua.edu.cn/ 
Download site 
http://data.ess.tsinghua.edu.cn/ 
Availability 

Open Access 

Technical documentation 
Wang et al. (2015) 


Other references of interest 


Product 

LULC general 
Dates 

2001, 2010 
Formats 

Raster 

Pixel size 

250 m 

Thematic resolution 


25 classes: 


0 (a), 6 (ag), 7 (v), 1 (m), 0 (na)


Compatible legends 
FAO-LCCS, IGBP 
Extent 

Global 

Updating 

Not expected 
Change detection 
Yes 

Overall accuracy 
Expected to be >75% 
Website Language English 


Format(s) 
.tiff


Project 


This product forms part of the project led by Tsinghua 
University to effectively map land uses and covers across the 
world, which mainly focused on FROM-GLC and the pro- 
duction of thematic LUC databases. Several of these datasets 
were used in the production of GLC250. The classification 
legend for GLC250 was also taken from FROM-GLC. 


Production method 


GLC250 was obtained after the classification of MODIS 
imagery (MOD13Q1) with a random forest classifier fed with 
auxiliary data: slope, latitude, MODIS vegetation indexes. 
For each year of reference (2001, 2010), a classification was 
carried out for three different dates: the year of reference, the 
year before and the year after. For the year 2001, for example, 
images from 2000, 2001 and 2002 were classified. 

The three probability maps obtained after the classifica- 
tion carried out for each year of reference were processed 
through a spatial-temporal consistency model (MAP-MRF) 
to improve the LUC classification. The final LUC map was 
improved in a post-classification phase through a rule-based 
label adjustment method using auxiliary data from MODIS 
Vegetation Continuous Fields (MOD44B), slope and 
Enhanced Vegetation Index series. 


Product description 


A map for each year of reference can be downloaded in a 
single compressed file. Each file contains all the raster files 
that make up the LUC map for each year of reference. To 
this end, the global map is split into multiple tiles following 
the MODIS tile grid.³
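
To find the tile that covers a given location, the horizontal (h) and vertical (v) indices of the MODIS grid can be computed from its coordinates. The sketch below assumes the standard MODIS sinusoidal grid of 36 x 18 tiles, each spanning 10 degrees at the equator, and the usual `hNNvNN` naming pattern. Whether the GLC250 files follow exactly this naming is an assumption that should be checked against the downloaded files.

```python
import math


def modis_tile(lat_deg: float, lon_deg: float) -> tuple:
    """Return the (h, v) indices of the MODIS sinusoidal tile containing a point.

    Assumes the standard 36 x 18 grid of tiles spanning 10 degrees each at
    the equator, with h counted eastwards from 180 W and v southwards from 90 N.
    """
    if not (-90.0 <= lat_deg <= 90.0 and -180.0 <= lon_deg <= 180.0):
        raise ValueError("latitude/longitude out of range")
    # Sinusoidal projection: the x coordinate shrinks with the cosine of latitude.
    x_deg = lon_deg * math.cos(math.radians(lat_deg))
    h = min(int((x_deg + 180.0) // 10), 35)
    v = min(int((90.0 - lat_deg) // 10), 17)
    return h, v


def tile_name(lat_deg: float, lon_deg: float) -> str:
    """Format the tile indices in the conventional hNNvNN pattern."""
    h, v = modis_tile(lat_deg, lon_deg)
    return f"h{h:02d}v{v:02d}"


madrid_tile = tile_name(40.4, -3.7)
sydney_tile = tile_name(-33.9, 151.2)
```

Points that lie exactly on a tile boundary are assigned to the tile to the east or south, so it is safer to test a location inside the area of interest than one on its edge.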


3 https://modis-land.gsfc.nasa.gov/MODLAND_grid.html.
Downloads


GLC250—2010 


— Raster files with a LUC map for each tile making up the MODIS tile 
grid (296 files) 


Legend and codification 


The GLC250 classification scheme is the same as that 
developed for FROM-GLC. It is a two-level classification 
scheme, which allows the LUC map to be displayed at two 
different levels of detail. Only the most detailed scheme 
(Level 2) is displayed here. Interested users can consult the 
correspondence between Level 2 and Level 1 classes on the 
project's website.⁴


Code | Label | Code | Label
11 | Rice fields | 42 | Other shrublands
12 | Greenhouse farming | 61 | Lake
13 | Other croplands | 62 | Reservoir/pond
14 | Seasonal croplands | 63 | River
15 | Pastures | 64 | Ocean
21 | Broadleaf forests | 91 | Dry salt flats
22 | Needleleaf forests | 92 | Sandy areas
23 | Mixed forests | 93 | Exposed bare rock
24 | Orchards | 94 | Dry lake/river bottoms
31 | Marshland | 95 | Tidal area
32 | Herbaceous tundra | 101 | Snow
33 | Other grasslands | 102 | Ice
41 | Shrub and brush tundra | |


4 http://data.ess.tsinghua.edu.cn/.
5 MCD12Q1—MODIS/Terra + Aqua Land Cover Type


Website of reference 
https://lpdaac.usgs.gov/products/mcd12q1v006/
Download site
https://lpdaac.usgs.gov/products/mcd12q1v006/
Availability 

Open access under registration 


Technical documentation 


Product 

LULC general 
Dates 

2001-2020 

Formats 

Raster 

Pixel size 

500 m, 1 km, 0.05°
Thematic resolution 


18 classes (IGBP legend): 
1 (a), 1 (ag), 10 (v), 2 (m), 1 (na) 


Compatible legends 

IGBP, UMD, LAI, BGC, PFT, FAO-LCCS 
Extent 

Global 

Updating 

Expected 

Change detection 

Not recommended 

Overall accuracy 
Expected to be >71% 
Website Language English 


Format(s) 
.hdf 


Friedl et al. (2002, 2010), Friedl and Sulla-Menashe (2019), Sulla-Menashe et al. (2019)


Other references of interest 


Fritz and See (2005), Giri et al. (2005), Hao and Gen-Suo (2014), Tchuenté et al. (2011) 
Project


MCD12Q1, also known as MODIS Land Cover type, dates 
back to 2002, after the launch into space of the TERRA 
satellite carrying the MODIS sensor. The MODIS sensor 
provided a new source of imagery for global LUC mapping. 
This led to the appearance of the MODIS Land Cover pro- 
ject, which aimed to produce a yearly series of LUC maps 
that could satisfy the demands of different communities 
interested in climate and environmental monitoring at global 
or very coarse scales. At the time the dataset was launched, 
only a few global LUC datasets were available, usually at 
coarser resolutions. 

MODIS Land Cover was created by a team led by Boston
University. Since 2002, six versions of the product
have been developed. The latest is MODIS Land Cover
Collection 6, which introduced the most important changes
to the production method since the dataset was first
developed.

A complementary product at coarser resolution has been 
developed as part of the same project: MCD12C1 (0.05 Deg). 


Production method 


MCD12Q1 was obtained by means of supervised classification
(Random Forests) of MODIS imagery for the period
2001-2020. Once the classification had been obtained for
each year, it was adjusted with the aid of auxiliary data: C5
MCD12Q1, C6 MODIS Land Water mask, C5 MODIS
Vegetation Continuous Fields (VCF), WorldClim dataset, a
global urban layer and global crop type information compiled
from census data.

As a result of the classification, class probability rasters
were obtained for each LUC category. These give
the probability of each pixel belonging to a specific LUC
category. These probability layers provided a base on which
to map LUC according to six different classification
schemes: IGBP, UMD, LAI, BGC, PFT and FAO-LCCS. To
ensure the consistency of the classification over
time, a hidden Markov model (HMM) was applied to the
adjusted classification to reduce spurious changes between years.
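
The effect of an HMM on a time series of class probabilities can be illustrated with a toy example for a single pixel. The sketch below is not the MCD12Q1 algorithm: it applies a standard Viterbi search with a "sticky" transition model, and the `stay` probability, the function name `viterbi_smooth` and the probability values are all assumptions made for illustration.

```python
import math


def viterbi_smooth(yearly_probs, stay=0.9):
    """Most likely label sequence for one pixel under a 'sticky' HMM.

    yearly_probs: list (one item per year) of {label: probability} dicts,
                  treated here as emission likelihoods.
    stay:         assumed probability that the true class does not change
                  from one year to the next.
    """
    labels = sorted(yearly_probs[0])
    n = len(labels)
    change = (1.0 - stay) / (n - 1) if n > 1 else 0.0
    eps = 1e-12  # avoids log(0)

    def log_trans(a, b):
        return math.log(stay if a == b else change + eps)

    # Initial scores: uniform prior over labels, so only the emission counts.
    score = {l: math.log(yearly_probs[0].get(l, 0.0) + eps) for l in labels}
    back = []
    for probs in yearly_probs[1:]:
        new_score, pointers = {}, {}
        for cur in labels:
            best_prev = max(labels, key=lambda p: score[p] + log_trans(p, cur))
            new_score[cur] = (score[best_prev] + log_trans(best_prev, cur)
                              + math.log(probs.get(cur, 0.0) + eps))
            pointers[cur] = best_prev
        back.append(pointers)
        score = new_score

    # Backtrack from the best final state.
    path = [max(labels, key=lambda l: score[l])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# One uncertain year (0.45 vs 0.55) inside an otherwise stable forest series.
noisy = [
    {"forest": 0.80, "cropland": 0.20},
    {"forest": 0.70, "cropland": 0.30},
    {"forest": 0.45, "cropland": 0.55},
    {"forest": 0.75, "cropland": 0.25},
    {"forest": 0.80, "cropland": 0.20},
]
# A confident and persistent switch to cropland.
persistent = [
    {"forest": 0.80, "cropland": 0.20},
    {"forest": 0.70, "cropland": 0.30},
    {"forest": 0.10, "cropland": 0.90},
    {"forest": 0.10, "cropland": 0.90},
    {"forest": 0.10, "cropland": 0.90},
]
smoothed_noisy = viterbi_smooth(noisy)
smoothed_persistent = viterbi_smooth(persistent)
```

In the first series, taking the most probable class year by year would produce a one-year switch to cropland, which the smoothed sequence removes. In the second series the change is well supported by the probabilities and is kept.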


Product description 


MCD12Q1 may be downloaded through different servers or 
tools: AppEEARS, Data Pool, NASA Earthdata Search, 
USGS EarthExplorer, OPeNDAP, DAAC2Disk Utility and 
LDOPE. Depending on the server or tool chosen, users can 
download the product as a single file for each year of ref- 
erence or in tiles for specific areas of interest. 

The download includes the raster file with LUC data in 
six different classification schemes and PDF documents with 
the technical specifications for the product. 
Downloads


MCD12Q1 (500 m) 


— Raster file with multiple bands, including LUC information in five 
different classification schemes and data quality (.hdf) 
— PDFs with technical information about the product 


Legend and codification 


MCD12Q1 is distributed for six different, widely used
classification schemes. The only one displayed here is the 
IGBP scheme, which is one of the most commonly used. 
However, more information about the codes and class 
descriptions for the other classification legends is available 
in the user guide for this dataset (Sulla-Menashe et al. 2019). 


MCD12Q1—IGBP (International Geosphere-Biosphere Programme) 


Code | Label | Code | Label
1 | Evergreen needleleaf forests | 10 | Grasslands
2 | Evergreen broadleaf forests | 11 | Permanent wetlands
3 | Deciduous needleleaf forests | 12 | Croplands
4 | Deciduous broadleaf forests | 13 | Urban and built-up lands
5 | Mixed forests | 14 | Cropland/natural vegetation mosaics
6 | Closed shrublands | 15 | Permanent snow and ice
7 | Open shrublands | 16 | Barren
8 | Woody savannas | 17 | Water bodies
9 | Savannas | 18 | Unclassified


Practical considerations 


Users can consult the dataset online through the Web Map
Service (WMS) available here.⁵ The dataset is also available
at a spatial resolution of 0.05°: MCD12C1 (0.05 Deg).⁶

This dataset is not recommended for the study of LUC
change, because of the high technical variability in the
mapped LUC from one year to the next.


5 https://lpdaacgis.cr.usgs.gov/arcgis/rest/services/WMS?f=pjson.
6 https://lpdaac.usgs.gov/products/mcd12c1v006/.
6 GLCNMO—Global Land Cover by National Mapping Organization


Website of reference 
https://globalmaps.github.io/glcnmo.html 

Download site 
https://globalmaps.github.io/glcnmo.html 
Availability 

Open Access 

Technical documentation 

Kobayashi et al. (2017), Tateishi et al. (2011, 2014) 
Other references of interest 

Hua et al. (2018) 


Product 

LULC general 
Dates 

2003, 2008, 2013 
Formats 

Raster 

Pixel size 


1 km (2003) 
500 m (2008, 2013) 


Thematic resolution 


20 classes: 
1 (a), 3 (ag), 11 (v), 2 (m), 0 (na) 


Compatible legends 
FAO LCCS 

Extent 

Global 

Updating 

Not planned 

Change detection 
No 

Overall accuracy 
Expected to be >75% 


Website Language English 


Format(s) 
.tiff
Project


GLCNMO is a project promoted by the International 
Steering Committee for Global Mapping (ISCGM) in col- 
laboration with the Geospatial Information Authority of 
Japan (GSI), Chiba University and national mapping orga- 
nizations from different participant countries. It is part of a 
wider effort to create global datasets on different subjects, 
including land cover and land use. 

The project has delivered three global LUC maps. Each 
one was produced at a different time and various method- 
ological changes were introduced between the production of 
each map. The most evident one was the change in spatial 
resolution after the 2003 map. Another important difference 
was the number of countries taking part in each edition of 
the map: 40 countries took part in the production of the 2003 
map, 14 in the 2008 map and 22 in the map for 2013. 

The ISCGM was wound up in 2016 and its data was trans- 
ferred to the Geospatial Information Section in the United 
Nations. We, therefore, do not expect any updates on this project. 


Production method 


The three LUC maps were produced at the continental level 
using a mixture of different methods. The maps for each 
continent were prepared by separate groups, with national 
experts providing assistance for each case. 

Most of the categories (14 in 2003 and 2008 and 11 in 2013) 
were obtained through supervised classification of MODIS 
imagery. The training samples for the classifier were selected 
with great care using photointerpretation from sources like 
Google Earth and other auxiliary data. Different classifiers were 
used for the different maps. Whereas the map for 2003 was 
produced using a maximum likelihood classifier, the ones for 
2008 and 2013 were based on a decision tree classifier. 

The remaining categories that were not classified using 
the method described above were individually mapped 
according to different procedures adapted to the specific 
needs of each category. These were urban, tree open, man- 
grove, wetland, snow/ice and water in 2003 and 2008. In 
addition to those, herbaceous areas, forests and agricultural 
areas were also mapped in this way in 2013. The strategies 
used to map these categories also varied in the different 
editions of the map, mainly involving specific classification 
methods of MODIS imagery, as well as the use of additional 
information, such as population density datasets, thematic 
MODIS products and other global LUC maps. 
Product description


The GLCNMO LUC map is distributed individually for each 
available year. The map for each year is split into four tiles, 
which can be downloaded in different zipped files. No other 
additional information is provided, except for the scientific 
papers presenting each map. 


Downloads 


GLCNMO (version 3) 


— Raster LUC map covering North America, the north of South
America and the west of Europe and Africa (1_1)

— Raster LUC map covering Europe, the east of Africa and Asia (1_2)

— Raster LUC map covering South America and the west of the
Antarctic (2_1)

— Raster LUC map covering Africa, the south of Asia and Oceania
(2_2)


Legend and codification 


Code | Label | Code | Label
1 | Broadleaf evergreen forest | 11 | Cropland
2 | Broadleaf deciduous forest | 12 | Paddy field
3 | Needleleaf evergreen forest | 13 | Cropland/other vegetation mosaic
4 | Needleleaf deciduous forest | 14 | Mangrove
5 | Mixed forest | 15 | Wetland
6 | Tree open | 16 | Bare area, consolidated (gravel, rock)
7 | Shrub | 17 | Bare area, unconsolidated (sand)
8 | Herbaceous | 18 | Urban
9 | Herbaceous with sparse tree/shrub | 19 | Snow/ice
10 | Sparse vegetation | 20 | Water bodies


Practical considerations 


As there are no auxiliary datasets or documentation, users 
who require more detailed information about the character- 
istics of the dataset should consult the scientific papers cited 
above (14.6 Technical Documentation). 
7 GlobCover


Product 

LULC general 
Dates 

2005, 2009 
Formats 

Raster 

Pixel size 

300 m 

Thematic resolution 


23 classes: 
1 (a), 2 (ag), 14 (v), 4 (m), 1 (na) 


Compatible legends 
FAO LCCS 

Extent 

Global 

Updating 

Not planned 
Change detection 


Not recommended 


Overall accuracy 
Expected to be >78.0%


Website of reference
http://due.esrin.esa.int/page_globcover.php

Website Language English

Download site

http://due.esrin.esa.int/page_globcover.php

Availability

Open Access

Format(s)
.tiff

Technical documentation 

Bicheron et al. (2008), Bontemps et al. (2011) 

Other references of interest 

Defourny et al. (2010) 
Project


GlobCover is a project run by the European Space Agency 
(ESA) in collaboration with the Joint Research Centre 
(JRC) of the European Commission, the European Environ- 
ment Agency, the FAO, the UN Environment Programme 
(UNEP), the Global Observations of Forest Cover Land-use 
Dynamics (GOFC-GOLD) programme and the International 
Geosphere-Biosphere Programme (IGBP). It started in 2005 
and produced two global LUC maps for the reference years 
2005 and 2009. The Université Catholique de Louvain 
(UCL) also contributed to the 2009 edition of the map. 

The aim of the project was to develop global LUC maps 
using images from the MERIS sensor onboard the ENVI- 
SAT satellite. At the time it was launched, the 2005 Glob- 
Cover map was the first global LUC map at a spatial 
resolution of 300 m. 

Based on the results of GlobCover, the ESA launched a 
new project called GlobCorine in which two new LUC maps 
compatible with the Corine Land Cover classification legend 
were created for Europe from the same imagery. The 
LC-CCI project from the ESA (see Sect. 2) builds on the 
progress made and the lessons learnt during the GlobCover 
project. 


Production method 


GlobCover maps were obtained by classifying imagery 
captured by the MERIS sensor. Urban and wetland areas, 
which are not well represented, were classified using a 
supervised classifier. The remaining categories were classi- 
fied in a series of spectro-temporal classes through an 
unsupervised classifier. Once classified, the spectro-temporal 
classes were labelled automatically according to the infor- 
mation provided by the reference datasets. For the 2005 
map, the reference datasets were the GLC2000 global LUC
map (see Sect. 3 in Chap. “Global General Land Use Cover
Datasets with a Single Date”) and other high-quality
regional LUC maps. For the 2009 map, the GlobCover 2005
map was used as a reference.
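
A simple way to picture the automatic labelling step is a majority vote: each spectro-temporal class receives the reference label that occurs most often among its pixels. The text does not specify the rule used in GlobCover, so the majority vote, the function name `label_clusters` and the sample values below are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def label_clusters(cluster_ids, reference_labels):
    """Map each cluster id to the most frequent reference label among its pixels.

    Both arguments are equally long sequences (one item per pixel).
    Pixels whose reference label is None are ignored.
    """
    if len(cluster_ids) != len(reference_labels):
        raise ValueError("inputs must have the same length")
    votes = defaultdict(Counter)
    for cluster, ref in zip(cluster_ids, reference_labels):
        if ref is not None:
            votes[cluster][ref] += 1
    return {cluster: counts.most_common(1)[0][0] for cluster, counts in votes.items()}


# Hypothetical unsupervised classes and reference labels for nine pixels.
clusters = [0, 0, 0, 1, 1, 1, 1, 2, 2]
reference = ["forest", "forest", "shrubland",
             "cropland", "cropland", None, "forest",
             "water", "water"]

cluster_to_label = label_clusters(clusters, reference)
labelled_map = [cluster_to_label[c] for c in clusters]
```

Under this rule, every pixel of a spectro-temporal class receives the same label, including pixels where the reference map disagreed or had no information.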

The area for classification was divided into different 
regions, to account for the ecological and reflectance 
diversity of the world. Once labelled after classification, the 
LUC map was finally edited to account for inaccuracies in 
the representation of certain features. 

For the 2005 version, regional maps with a more detailed 
legend were also produced following the same classification 
procedure. 


Product description 


A zipped file is available for each GlobCover map. It con- 
tains the raster layer with the LUC information and all the 
auxiliary data that users may need to correctly interpret the 
dataset. This includes the classification legend, technical and 
data quality information, and files with the layer style of the 
map to automatically symbolize the raster in GIS software. 
A complementary raster detailing the source of the LUC 
information for each pixel (MERIS sensor classification 
(value = null) or a land cover database (value = 1)) is also 
provided. In a separate file, users can also download a raster 
for a coloured version of the LUC map. 


Downloads 


GlobCover 


— Raster file with LUC map
(“GLOBCOVER_L4_200901_200912_V2.3”)
— Raster file with quality information
(“GLOBCOVER_L4_200901_200912_V2.3_CLA_QL”)
— Preview image of the product
— Excel sheet with the map legend (“Globcover2009_Legend”)
— Layer style files for ArcGIS (.lyr) and ENVI (.dsr) 
— PDFs with technical information about the product 


GlobCover coloured 


— Raster file with coloured version of LUC map 
Legend and codification


Code | Label
11 | Post-flooding or irrigated croplands (or aquatic)
14 | Rainfed croplands
20 | Mosaic cropland (50-70%)/vegetation (grassland/shrubland/forest) (20-50%)
30 | Mosaic vegetation (grassland/shrubland/forest) (50-70%)/cropland (20-50%)
40 | Closed to open (>15%) broadleaved evergreen or semi-deciduous forest (>5 m)
50 | Closed (>40%) broadleaved deciduous forest (>5 m)
60 | Open (15-40%) broadleaved deciduous forest/woodland (>5 m)
70 | Closed (>40%) needleleaved evergreen forest (>5 m)
90 | Open (15-40%) needleleaved deciduous or evergreen forest (>5 m)
100 | Closed to open (>15%) mixed broadleaved and needleleaved forest (>5 m)
110 | Mosaic forest or shrubland (50-70%)/grassland (20-50%)
120 | Mosaic grassland (50-70%)/forest or shrubland (20-50%)
130 | Closed to open (>15%) (broadleaved or needleleaved, evergreen or deciduous) shrubland (<5 m)
140 | Closed to open (>15%) herbaceous vegetation (grassland, savannas or lichens/mosses)
150 | Sparse (<15%) vegetation
160 | Closed to open (>15%) broadleaved forest regularly flooded (semi-permanently or temporarily) - fresh or brackish water
170 | Closed (>40%) broadleaved forest or shrubland permanently flooded - saline or brackish water
180 | Closed to open (>15%) grassland or woody vegetation on regularly flooded or waterlogged soil - fresh, brackish or saline water
190 | Artificial surfaces and associated areas (urban areas >50%)
200 | Bare areas
210 | Water bodies
220 | Permanent snow and ice
230 | No data (burnt areas, clouds...)


Practical considerations 


Eleven regional maps with more detailed classification 
schemes were developed as part of the GlobCover Project 
for 2005. These maps were produced using the same 
methodology as the global GlobCover, but provided more 
thematic detail. Unfortunately, they are currently unavailable 
for download. 
8 FROM-GLC—Finer Resolution Observation and Monitoring of Global Land Cover


Product 

LULC general 
Dates 

2010, 2015, 2017 


Formats 


Raster 


Pixel size 


250 m, 500 m, 1 km, 5 km, 25 km, 50 km, 100 km (2010) 
30 m (2010, 2015, 2017) 
10 m (2017) 


Thematic resolution 


8 classes (2017): 
1 (a), 1 (ag), 3 (v), 0 (m), 0 (na) 


Compatible legends 


Extent 
Global 
Updating 


Not planned 


Change detection 


Not recommended 


Overall accuracy 
Expected to be >65% 


Website of reference

http://data.ess.tsinghua.edu.cn/

Website Language English

Download site

http://data.ess.tsinghua.edu.cn/

Availability
Open Access
Format(s)
.tiff
Technical documentation 

Chen et al. (2019), Gong et al. (2013), Yu et al. (2013, 2014) 

Other references of interest 

Ji et al. (2015), Xu et al. (2019) 


Project 


FROM-GLC was a project funded by Chinese research and 
innovation programmes that was led by Tsinghua Univer- 
sity. It brought together researchers from Chinese and other 
international institutions. 

The goal of this project was to produce global LUC 
datasets at medium to high spatial resolution. When the 
project started, there were no global LUC maps available at a 
resolution of 30 m using images from the Landsat archive. 
Maps at that resolution are useful for different user com- 
munities working in cross-regional and cross-national areas 
at that level of detail. The aim of FROM-GLC was therefore 
to provide new sources of data for modelling communities 
that required detailed global datasets. Global LUC maps at 
detailed scales are also useful for countries for which no 
other detailed LUC datasets are available. 

Three global LUC maps at three different time points 
(2010, 2015 and 2017) were created as part of this project. 
Three LUC maps are available for the year 2010. The 
original (FROM-GLC) was successively improved by 
changes in the production method, producing maps known 
as FROM-GLC-egg and FROM-GLC-agg, the latter being 
the final, most updated version. It is available at the original 
(30 m) and 7 other spatial resolutions: 250 m, 500 m, 1 km, 
5 km, 25 km, 50 km and 100 km. Unlike the maps for 2010 
and 2015, the one for 2017 was produced at two spatial 
resolutions: 10 and 30 m. 

The research team involved in the production of 
FROM-GLC has also taken part in related projects to pro- 
duce other national, regional and thematic LUC maps, most 
of them at fine spatial resolutions. These maps can be 
accessed through the project website and include national 
maps of China or Chile, thematic maps about water covers 
and other global LUC datasets. 


Production method 


Each FROM-GLC map was produced using a different 
method. The maps for 2010, 2015 and 2017 at 30 m were 
produced using a supervised classification of Landsat imagery. 

Four different classifiers were compared in the production 
of FROM-GLC for 2010. The first improved version of 
FROM-GLC, known as FROM-GLC-egg, included an 
image-segmentation method in the classification process and 
used two different classifiers. In addition, impervious sur- 
faces were individually mapped. For its part, 
FROM-GLC-agg was obtained by combining the previous 
two LUC maps (FROM-GLC, FROM-GLC-egg) using a 
decision tree algorithm. Impervious surfaces were remapped 
according to the information provided by the Nighttime 
Light Impervious Surface Area (NL-ISA) and the MODIS
urban extent (MODIS-urban) datasets. Once the 
FROM-GLC-agg map had been obtained at 30 m, it was 
then aggregated at seven other spatial resolutions through 
majority aggregation and proportion aggregation 
approaches. 
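
The two aggregation approaches can be sketched for a small grid as follows. We interpret majority aggregation as keeping the most frequent class in each block of fine pixels, and proportion aggregation as recording the share of each class in the block. This interpretation, the function name `aggregate`, the tie-breaking rule and the sample grid are assumptions, not details taken from the FROM-GLC documentation.

```python
from collections import Counter


def aggregate(grid, factor):
    """Aggregate a 2-D grid of class codes into blocks of factor x factor cells.

    Returns (majority, proportions): `majority` holds the most frequent
    code of each block, `proportions` a {code: share} dict for each block.
    Rows and columns that do not fill a whole block are dropped.
    """
    n_rows = len(grid) // factor
    n_cols = len(grid[0]) // factor
    majority, proportions = [], []
    for br in range(n_rows):
        maj_row, prop_row = [], []
        for bc in range(n_cols):
            block = [
                grid[r][c]
                for r in range(br * factor, (br + 1) * factor)
                for c in range(bc * factor, (bc + 1) * factor)
            ]
            counts = Counter(block)
            # Ties are broken in favour of the smallest code (an arbitrary choice).
            top = max(counts.values())
            maj_row.append(min(code for code, n in counts.items() if n == top))
            prop_row.append({code: n / len(block) for code, n in counts.items()})
        majority.append(maj_row)
        proportions.append(prop_row)
    return majority, proportions


# Hypothetical 4 x 4 grid using FROM-GLC (2017) codes:
# 1 cropland, 2 forest, 6 water, 9 bareland.
fine = [
    [2, 2, 1, 1],
    [2, 1, 1, 1],
    [6, 6, 9, 2],
    [6, 2, 9, 9],
]
coarse_majority, coarse_proportions = aggregate(fine, 2)
```

The majority result keeps a single class per coarse pixel and loses the minority classes, whereas the proportion result keeps them as fractions at the cost of producing one value per class.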

The map for 2017 at 10 m was obtained through a
supervised classification of Sentinel-2 imagery with a random
forest classifier in Google Earth Engine.


Product description 


The FROM-GLC LUC maps are not provided as a single 
global file. To facilitate downloading of the product, the 
world is split into different tiles. Users can download the tile 
corresponding to their area of interest according to its lati- 
tude and longitude values. 

FROM-GLC products for the year 2010 can also be
downloaded with the help of a kmz layer. After loading
it in Google Earth, users can visualize their area of interest
and automatically download the map corresponding to that
area.


Downloads 


FROM-GLC-agg (2010) 
— Raster file with LUC map 


FROM-GLC-agg hierarchy (2010) 


— Raster file with LUC map at 30 m 
— Raster file with LUC map at 250 m obtained by majority 
aggregation 
— Raster file with LUC map at 500 m obtained by majority 
aggregation 
— Raster file with LUC map at 1 km obtained by majority aggregation 
— Raster file with LUC map at 5 km obtained by majority aggregation 
— Raster file with LUC map at 5 km obtained by proportion 
aggregation 
— Raster file with LUC map at 10 km obtained by majority 
aggregation 
— Raster file with LUC map at 10 km obtained by proportion 
aggregation 
— Raster file with LUC map at 25 km obtained by majority 
aggregation 
— Raster file with LUC map at 25 km obtained by proportion 
aggregation 
— Raster file with LUC map at 50 km obtained by majority 
aggregation 
— Raster file with LUC map at 50 km obtained by proportion 
aggregation 
— Raster file with LUC map at 100 km obtained by majority 
aggregation 
— Raster file with LUC map at 100 km obtained by proportion 
aggregation 
FROM-GLC (2015)
— Raster file with LUC map 


FROM-GLC 30 m (2017) 
— Raster file with LUC map 


FROM-GLC 10 m (2017) 
— Raster file with LUC map 


Legend and codification 


A specific two-level classification scheme legend was ini- 
tially developed for the FROM-GLC project in 2010. This 
was updated with various changes for the FROM-GLC map 
for 2015. The map for 2017 has the simplest, least detailed 
classification legend (Level 1). In each case, we include the 
most detailed classification scheme available for each year. 
Users can consult the correspondence between level 2 and 
level 1 of the classification scheme for the years 2010 and 
2015 at the project website.⁷


FROM-GLC (2010)

Code | Label | Code | Label
11 | Rice | 62 | Pond
12 | Greenhouse | 63 | River
13 | Other | 64 | Sea
19 | Crop in urban | 69 | Water in urban
21 | Broadleaf | 71 | Shrub
22 | Needleleaf | 72 | Grass
23 | Mixed | 81 | High albedo
24 | Orchard | 82 | Low albedo
29 | Forest in urban | 91 | Saline-Alkali
31 | Managed | 92 | Sand
32 | Nature | 93 | Gravel
39 | Grass in urban | 94 | Bare Cropland
40 | Shrubland | 95 | Dry river/lake bed
49 | Shrub in urban | 96 | Other
51 | Grass | 99 | Bareland in urban
52 | Silt | 101 | Snow
59 | Wetland in urban | 102 | Ice
61 | Lake | 120 | Cloud


FROM-GLC (2015)

Code | Label | Code | Label
11 | Rice paddy | 41 | Shrubland, leaf-on
12 | Greenhouse | 42 | Shrubland, leaf-off
13 | Other | 51 | Marshland
14 | Orchard | 52 | Mudflat
15 | Bare farmland | 53 | Marshland, leaf-off
21 | Broadleaf, leaf-on | 60 | Water
22 | Broadleaf, leaf-off | 71 | Shrub and brush tundra
23 | Needleleaf, leaf-on | 72 | Herbaceous tundra
24 | Needleleaf, leaf-off | 80 | Impervious surface
25 | Mixed leaf, leaf-on | 90 | Bareland
26 | Mixed leaf, leaf-off | 92 | Bareland
31 | Pasture | 101 | Snow
32 | Natural grassland | 102 | Ice
33 | Grassland, leaf-off | 120 | Cloud


FROM-GLC (2017)

Code | Label | Code | Label
1 | Cropland | 6 | Water
2 | Forest | 8 | Impervious
3 | Grass | 9 | Bareland
4 | Shrubland | 10 | Snow/ice


Practical considerations


The project website, where all the information is stored and 
available for download, is not user-friendly. It is not easy to 
find the information the user is looking for. Users may also 
struggle to download datasets for their area of interest 
according to latitude and longitude information. When 
available, we recommend using the kmz file with Google 
Earth for this purpose. 

Little additional information is available. For a complete
description of the characteristics of the different maps, we
recommend that users read the scientific papers cited in the
introduction to this dataset above (14.8. Technical
Documentation).


* Categories only available in the FROM-GLC-Hierarchy product are 
shown in italics 


7 http://data.ess.tsinghua.edu.cn/. 




9 CGLS-LC100—Copernicus Global Land Service Dynamic Land Cover Map


Product 
LULC general 
Dates 
2015-2019 


Formats 


Raster 


Pixel size 
100 m 


Thematic resolution 


24 classes: 
1 (a), 1 (ag), 18 (v), 2 (m), 1 (na) 


Compatible legends 
FAO LCCS 

Extent 

Global 

Updating 


Yes, every year 


Change detection 


Possible, although sources of uncertainty may arise 


Overall accuracy 
Expected to be >80% 


Website of reference

https://land.copernicus.eu/global/products/lc

Website Language English

Download site

https://lcviewer.vito.be/download

Availability
Open Access
Format(s)
.tiff
Technical documentation 

Buchhorn et al. (2020a, b, c), Tsendbazar et al. (2019, 2020) 


Other references of interest 
Project


CGLS-LC100 is one of the deliverables produced as part of 
the Copernicus Global Land Service (CGLS), which aims to 
provide a series of bio-geophysical products to monitor land 
surface at a global scale. In addition to this LUC package, 
the programme produces other relevant variables, such as the 
Leaf Area Index (LAI), the Fraction of Absorbed Photo- 
synthetically Active Radiation (FAPAR), the Land Surface 
Temperature, soil moisture and other vegetation indices. 

The first version of CGLS-LC100 was released in 2017, 
mapping LUC for Africa. Since then, several updates of the 
product have improved the production methodology and 
extended its temporal and geographical coverage. The latest
version of the product (Collection 3), released in 2021,
covers the whole world for the period 2015-2019. It 
includes a method for detecting land cover change that 
addresses the main sources of technical uncertainty when 
studying change in a time series of LUC maps. 

In addition to the LUC map described here, the product 
also includes a series of continuous field layers or “fraction
maps” for the basic LUC classes mapped. Future updates of
the product are expected on an annual basis, using the 
imagery provided by the Sentinel satellite missions. 


Production method 


The Copernicus Global Land Service Dynamic Land Cover 
map is produced through a multistep processing framework. 
First, PROBA-V satellite images are pre-processed and 
merged following a Sentinel-2 tiling grid to create a 3-year 
epoch mosaic for each reference year. Second, a series of 
metrics (spectral and textural metrics, descriptive statistics) 
are extracted from each epoch mosaic. Third, imagery for all 
the epochs is classified using a regression algorithm, which 
delivers a cover fraction layer for each basic LUC class and 
reference year, and a supervised classification algorithm, 
which delivers a LUC map for each reference year. 

Various auxiliary data sources are used in the classifica- 
tion phase, i.e. seven different data masks and three extra 
datasets: biome clusters, water cover fractions and built-up 
cover fractions. 

To ensure the temporal consistency of the LUC map series, the
production of the dataset includes a temporal post-processing
phase. This consists of a BFAST break detection algorithm and a
Hidden Markov Model. The former is used to detect changes in an
independent time series of MODIS NIRv imagery, while the latter
is used to rule out technical changes in the classified epoch images.


Product description


CGLS-LC100 is distributed in tiles, following the Sentinel-2 
tiling grid (110 x 110 km). For each tile, users can down- 
load many different layers: the discrete classification con- 
taining the LUC map for the selected area; a layer with the 
classification probability; layers of cover fractions for each 
of the basic LUC classes mapped; a layer showing the level 
of confidence for the change measured between the different 
years in each pixel; and two extra layers: forest types and 
input data density. 

The download of the LUC map only includes the raster 
file with the LUC information. Each reference year must be 
downloaded separately. 


Downloads 


Land Cover classification—discrete classification 


— Raster file with LUC map 


Cover fractions—bare and sparse vegetation 


— Raster file with the cover fraction for the land cover under 
consideration 


Land Cover changes—change confidence 


— Raster file indicating the reliability of the change in the discrete 
class 


Others—forest types 


— Raster file indicating for all pixels with a cover fraction >1% the 
type of forest represented in the pixel 


Legend and codification 


Land Cover classification-discrete classification


Code | Label
0 | No input data
20 | Shrubs
30 | Herbaceous vegetation
40 | Cultivated and managed vegetation/agriculture (cropland)
50 | Urban/built up
60 | Bare/sparse vegetation
70 | Snow and ice
80 | Permanent water bodies
90 | Herbaceous wetland
100 | Moss and lichen
111 | Closed forest, evergreen needle leaf
112 | Closed forest, evergreen broad leaf
113 | Closed forest, deciduous needle leaf
114 | Closed forest, deciduous broad leaf
115 | Closed forest, mixed
116 | Closed forest, unknown
121 | Open forest, evergreen needle leaf
122 | Open forest, evergreen broad leaf
123 | Open forest, deciduous needle leaf
124 | Open forest, deciduous broad leaf
125 | Open forest, mixed
126 | Open forest, unknown
200 | Open sea


Cover fractions—bare and sparse vegetation 


Code Meaning 

0-100 Percentage of the pixel (0-100%) covered by the land
cover under consideration

200 Masked sea 


Land Cover changes—change confidence 


Code Change confidence Code Change confidence 

0 No change 2 Medium confidence 

1 Potential change 3 High confidence 

Others—forest types


Code | Forest type
0 | Unknown
1 | Evergreen, needle leaf forest (ENF)
2 | Evergreen, broad leaf forest (EBF)
3 | Deciduous, needle leaf forest (DNF)
4 | Deciduous, broad leaf forest (DBF)
5 | Mixed


Practical considerations 


Because of the large number of datasets available through 
this project, users are encouraged to make use of the dif- 
ferent layers of LUC information available. This will give 
them a better understanding of the uncertainties and limita- 
tions of the product. 
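
As an illustration of how the different layers can be combined, the following minimal sketch keeps only those pixels whose discrete class differs between two years and whose change-confidence code is 2 (medium) or 3 (high). It is a sketch under stated assumptions, not part of the product documentation: the small grids are hypothetical values rather than real tiles, the function name `confident_changes` is ours, and in practice the rasters would first be read with a GIS library of the user's choice.

```python
# Change-confidence codes as listed in the legend of the product.
CHANGE_CONFIDENCE = {
    0: "No change",
    1: "Potential change",
    2: "Medium confidence",
    3: "High confidence",
}


def confident_changes(class_before, class_after, confidence, min_confidence=2):
    """Return (row, col, before, after) for every pixel whose class differs
    and whose change-confidence code is at least `min_confidence`."""
    changes = []
    rows = zip(class_before, class_after, confidence)
    for r, (row_b, row_a, row_c) in enumerate(rows):
        for c, (b, a, conf) in enumerate(zip(row_b, row_a, row_c)):
            if b != a and conf >= min_confidence:
                changes.append((r, c, b, a))
    return changes


# Toy 2 x 3 grids (hypothetical values, not real data).
lc_2015 = [[112, 112, 40], [30, 30, 50]]
lc_2019 = [[112, 40, 40], [20, 30, 50]]
conf = [[0, 3, 0], [1, 0, 0]]

print(confident_changes(lc_2015, lc_2019, conf))
```

Lowering `min_confidence` to 1 would also keep the pixels flagged only as potential change, which may be useful when exploring how sensitive a change analysis is to this threshold.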

Users can download the product covering the whole
globe, which is distributed through the files in the Zenodo
repository.⁸


8 2015: https://doi.org/10.5281/zenodo.3939038; 2016: https://doi.org/10.5281/zenodo.3518026;
2017: https://doi.org/10.5281/zenodo.3518036; 2018: https://doi.org/10.5281/zenodo.3518038;
2019: https://doi.org/10.5281/zenodo.3939050.


Abstract


The land uses and covers of Europe are the most 
systematically mapped in the world today, and their 
associated datasets offer the greatest spatial and thematic 
detail. Thanks to the work done within the Copernicus 
Land Monitoring programme run by the European
Environment Agency (EEA) and the Joint Research
Centre (JRC) of the European Commission, there are 
many general LUC datasets covering most of the 
European continent. These general datasets map all land 
uses and covers on the ground, without focusing on any 
specific type. However, whereas some cover the whole of 
Europe, others only map specific local areas of interest, 
such as urban or coastal areas, riparian zones or spaces 
protected under the Natura 2000 network. CORINE Land
Cover (CLC) is the flagship European LUC mapping 
programme and a reference worldwide. It has provided 
consistent LUC information at a detailed scale 
(1:100,000) every 6 years since 1990. This is the result 
of a high degree of coordination between many different 
organizations and institutions across Europe. The Coper- 
nicus programme also includes other European datasets 
such as Urban Atlas, N2K, Riparian Zones and Coastal 
Zones, which provide very detailed LUC information at 
higher levels of spatial detail (scale 1:10,000) for specific 
geographical area types: Functional Urban Areas, the 
Natura 2000 network, riparian zones from Strahler level 
2-8 rivers and areas 10 km away from the coastline. 
However, these projects do not cover the same long 
timeframe as CLC. In addition, their long-term future is
far from clear in that updates are only planned for Urban
Atlas and Coastal Zones. PELCOM, GlobCorine and the
Annual Land Cover Product are the European projects 
that most resemble the LUC maps available at global and 
supra-national scales for other parts of the world. They 
were obtained through classification of satellite imagery. 
PELCOM and GlobCorine are only available for a few 
dates and at quite coarse spatial resolutions: 1 km and 
300 m respectively. The Annual Land Cover Product 
consists of a series of LUC maps for the period 2000- 
2019 at a highly detailed spatial resolution (30 m). It 
offers information for a large number of different points in 
time. However, it makes a separate classification of land 
uses each year, which means that change analysis with 
this dataset is more uncertain than with CLC or other 
Copernicus Land Monitoring products. HILDA and 
S2GLC 2017 are LUC datasets produced within the 
framework of different research projects, which can be 
considered reference products in their respective fields. 
HILDA provides one of the largest time series of LUC 
maps currently available, spanning the period from 1900 
to 2010. S2GLC 2017 is one of the most spatially detailed
LUC mapping efforts at a supra-national scale, with
a spatial resolution of 10 m.


1 HILDA


Website of reference 


Product 

LULC general 

Dates 

1900-2010 (every 10 years) 
Formats 

Raster 

Pixel size 

1 km

Thematic resolution 


5 classes plus water: 
1 (a), 1 (ag), 2 (v), 0 (m), 0 (na)¹


Compatible legends 
IPCC, LCCS 
Extent 


European Union plus the UK and Switzerland


Updating 

Not planned 

Change detection 

Yes 

Overall accuracy 

Not specified 

Website Language English

https://www.wur.nl/en/Research-Results/Chair-groups/Environmental-Sciences/Laboratory-of-Geo-information-Science-and-Remote-Sensing/Models/Hilda.htm


Download site 


https://www.wur.nl/en/Research-Results/Chair-groups/Environmental-Sciences/Laboratory-of-Geo-information-Science-and-Remote-Sensing/Models/Hilda/HILDA-data-downloads.htm
Availability 


Open Access, after providing 
personal data 


Technical documentation 
Fuchs et al. (2013, 2015a, b)
Other references of interest 
Fuchs (2015) 


1 (a): artificial; (ag): agriculture; (v): vegetation; (m): mixed classes;
(na): no data.


Format(s) 
ESRI Grid, .tiff, .ascii 


General Land Use Cover Datasets for Europe 
Project 


Historic Land Dynamics Assessment (HILDA) is a project 
aimed at reconstructing historic land cover/use and LUC 
changes in Europe. Unlike other LUC reconstruction pro- 
jects and datasets, it allows us to study LUC changes over 
time. The recently launched HILDA+ project takes the
original project one step further by mapping historical LUC 
changes at a global scale for the period 1960-2019. 

The reconstruction of historic LUC landscapes and
changes is carried out using a model maintained and 
developed by the Department of Geoinformation Science 
and Remote Sensing of Wageningen University. The model 
allocates non-spatial historic LUC information on the 
ground. 


Production method 


Historic LUC maps for the HILDA project were obtained 
through an extensive workflow involving various steps. 
First, gross and net LUC changes per decade were obtained 
for the period 1950-2010 from a set of sources providing 
historic LUC information: UNFCCC national reporting data, 
CORINE Land Cover, Historisch Grondgebruik Nederland 
(HGN) for the Netherlands, FAO-RSS data and BioPress 
data with classified aerial photographs of 73 sample sites 
across Europe. Later, LUC data was spatially allocated by 
the HILDA model. Four categories were spatially allocated 
at this stage. A fifth category (other land) remained static 
throughout the time series. Water was a subclass of the 
"other land" category, which was only separated in the final 
maps for visualization purposes. 

The model allocates the LUC categories using a series of 
probability maps. A specific probability map for each cate- 
gory was created on the basis of historical LUC maps and a 
range of socioeconomic and physical (soil properties, cli- 
mate and terrain) factors. The categories were allocated 
hierarchically according to their socioeconomic value: set- 
tlements were allocated first, followed by croplands, forest 
and grasslands. 
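
The hierarchical allocation described above can be illustrated with a toy sketch. This is not the HILDA model itself: it assumes one probability value per cell and category, a fixed demand (number of cells) per category, and a simple rule whereby each category, taken in the order given in the text, occupies the most probable cells that are still free. The function name `allocate` and all the values are hypothetical.

```python
# Allocation order as described in the text: settlements first,
# followed by croplands, forest and grasslands.
ALLOCATION_ORDER = ["settlement", "cropland", "forest", "grassland"]


def allocate(probability, demand, order=ALLOCATION_ORDER):
    """Allocate categories to cells hierarchically.

    probability: {category: [p_cell0, p_cell1, ...]}
    demand:      {category: number of cells to allocate}
    Returns a list with one category (or None) per cell.
    """
    n_cells = len(next(iter(probability.values())))
    allocation = [None] * n_cells
    for category in order:
        free = [i for i in range(n_cells) if allocation[i] is None]
        # Most probable free cells first; ties are broken by cell index.
        free.sort(key=lambda i: (-probability[category][i], i))
        for i in free[:demand.get(category, 0)]:
            allocation[i] = category
    return allocation


# Hypothetical probability maps for a landscape of five cells.
probability = {
    "settlement": [0.9, 0.2, 0.1, 0.1, 0.0],
    "cropland": [0.8, 0.7, 0.3, 0.2, 0.1],
    "forest": [0.1, 0.2, 0.6, 0.9, 0.3],
    "grassland": [0.2, 0.3, 0.5, 0.4, 0.6],
}
demand = {"settlement": 1, "cropland": 1, "forest": 2, "grassland": 1}

print(allocate(probability, demand))
```

In this example the first cell is highly suitable for both settlement and cropland, but it goes to settlement because that category is allocated first. This is the practical effect of a hierarchical ordering.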

Once the model had been run for the 1950-2010 timeframe,
four extra maps were obtained for the period 1900-1950,
based on historical LUC statistics and an extrapolation
of the change matrix. The pre-1950 maps therefore assume
that the transition rates estimated for 1950-2010 also held
in earlier decades. This could be an important source of
uncertainty in these maps.


Product description


The product is delivered in four different packages, two of 
which include the series of LUC maps (1900-2010). Of 
these, one considers the net changes over the course of each 
decade, while the other considers the gross changes. The 
other two packages detail the specific transitions that take 
place between the different categories, one charting net 
changes and the other gross changes. 

Each package can be downloaded in three different file 
formats (ESRI Grid, TIFF, ASCII). Each download includes 
a raster with LUC information for each decade and a sup- 
plementary file with the technical description of the product. 


Downloads 
Gross land changes 


— Raster files with LUC maps for each decade 
— Text document with technical information and the legend 


Net land changes 


— Raster files with LUC maps for each decade 
— Text document with technical information and the legend 


Transitions maps (for gross and net) 


— Raster files with LUC maps for each decade 
— Text document with technical information and the legend 


Legend and codification 


HILDA gross and net maps 


Code Label Code Label 

111 Settlement 444 Grassland 
222 Cropland 555 Other land 
333 Forest 666 Water 


HILDA gross and net transitions maps (1900-2000)


Code | Label
112 | Cropland to settlement
113 | Forest to settlement
114 | Grassland to settlement
115 | Other land to settlement
116 | Water to settlement
121 | Settlement to cropland
131 | Settlement to forest
141 | Settlement to grassland
151 | Settlement to other land
161 | Settlement to water
223 | Forest to cropland
224 | Grassland to cropland
225 | Other land to cropland
226 | Water to cropland
232 | Cropland to forest
242 | Cropland to grassland
252 | Cropland to other land
262 | Cropland to water
334 | Grassland to forest
335 | Other land to forest
336 | Water to forest
343 | Forest to grassland
353 | Forest to other land
363 | Forest to water
445 | Other land to grassland
446 | Water to grassland
454 | Grassland to other land
464 | Grassland to water
556 | Water to other land
565 | Other land to water

HILDA gross and net transitions maps (2000-2010)


Code | Label
112 | Settlement to cropland
113 | Settlement to forest
114 | Settlement to grassland
115 | Settlement to other land
116 | Settlement to water
121 | Cropland to settlement
131 | Forest to settlement
141 | Grassland to settlement
151 | Other land to settlement
161 | Water to settlement
223 | Cropland to forest
224 | Cropland to grassland
225 | Cropland to other land
226 | Cropland to water
232 | Forest to cropland
242 | Grassland to cropland
252 | Other land to cropland
262 | Water to cropland
334 | Forest to grassland
335 | Forest to other land
336 | Forest to water
343 | Grassland to forest
353 | Other land to forest
363 | Water to forest
445 | Grassland to other land
446 | Grassland to water
454 | Other land to grassland
464 | Water to grassland
556 | Other land to water
565 | Water to other land


Practical considerations 


This is a valuable dataset because of the rich historic LUC 
information it provides. There are very few long, dense 
historical series of LUC maps that measure LUC change 
over time. Nonetheless, users should be aware of the 
uncertainties associated with this dataset. The maps prior to 
1950 were created by extrapolating the patterns of change 
for the period 1950-2010. This could introduce a high 
degree of uncertainty. 

An online visualization of the maps for the years 1900 
and 2010 is available, together with other auxiliary infor- 
mation, at http://www.geo-informatie.nl/fuchs003/ft. 

To study global historical LUC change at a similar level 
of detail, users should refer to the associated HILDA+ 
project. 




2 CLC—CORINE Land Cover 


Product 

LULC general 

Dates 

1990, 2000, 2006, 2012, 2018 
Formats 

Vector and raster 

Scale/Pixel size 


Photointerpretation scale: 1:100,000 

Minimum Mapping Unit: 25 ha/5 ha for changes 
Minimum Mapping Width: 100 m 

Pixel size (raster): 100 m 


Thematic resolution 


44 classes: 


11 (a), 8 (ag), 8 (v), 6 (m), 3 (na) 
Compatible legends 

CLC 

Extent 


Europe, with an increasing number of countries taking part in the 
project each year (39 in CLC18) 


Updating 
Scheduled updates every 6 years 


Change detection 

Yes, through the layer of changes 
Overall accuracy 

Expected to be >85% 

Website of reference Website Language English, German and French 
https://land.copernicus.eu/pan-european/corine-land-cover 

Download site 

https://land.copernicus.eu/pan-european/corine-land-cover 

Availability | Format(s)

Open Access, after registration | .tiff, .gdb, .gpkg

Technical documentation 


Bossard et al. (2000), Büttner et al. (2002, 2011, 2012, 2014), European Environment Agency (1994, 2006a, b, 2007), Jaffrain et al. (2017),
Kosztra et al. (2019), Soukup et al. (2017)


Other references of interest 


Bach et al. (2006), Bielecka and Jenerowicz (2019), Büttner (2014), European Environment Agency (2006c), Feranec et al. (2010, 2016), 
Gallego (2001), Garcia-Alvarez and Camacho Olmedo (2017), Neumann et al. (2007)


Project


CORINE Land Cover (CLC) is a European project moni- 
toring Land Use and Cover that dates back to 1985. It aims 
to map land uses and land covers across the whole continent 
according to the same rules. It is currently part of the land 
monitoring efforts of the Copernicus programme. 

The number of countries taking part in the project has 
been increasing since its inception, from the initial group of 
26 countries that created the CLC 1990 to the 39 countries 
that participated in the most recent edition.² In the meantime,
the production of CLC has undergone several technical and 
methodological changes. The fact that CLC is produced at a 
national level means that methods vary from one country to 
the next. 

Because of its long life, detail, consistency and wide 
range of applications, CLC is one of the most renowned 
LUC mapping initiatives worldwide. Various European 
countries have developed national LUC products based on 
CLC. In some cases, these products are new CLC layers with 
an extended legend, adapted to the specificities of the 
country. In other cases, they are new CLC layers for different 
dates to those used in the main Europe-wide project. 


Production method 


The production of CLC is coordinated by the European 
Environment Agency (EEA). Each participant country is 
responsible for mapping its own territory according to the 
general guidelines developed by the EEA. 

The method of production may vary from country to 
country. Initially, CLC was mapped at national scales based 
on the photointerpretation of Landsat imagery. In the fol- 
lowing editions, most of the countries decided to stick to this 
method, using different satellite imagery according to EEA 
prescriptions: Landsat, SPOT, IRS P6, RapidEye, LISS III,
Sentinel. In the latest editions, the production method has 
varied in some cases. A few countries, like Germany or 
Spain, produce the CLC database by generalizing national 
LUC databases at finer scales. This has introduced important 
changes in the way land uses and covers are mapped over 
time for these countries. For both production methods, 
photointerpretation and map generalization, the CLC map 
obtained is then subject to expert review to ensure its con- 
sistency and validity. 

The first CLC map was produced for the reference year
1990 and the subsequent editions have been updates of this
initial map. The national teams do not draw a new map for each
new reference year. Instead, they map the changes for the
analysed period (e.g. 1990-2000) and then update the base
map for the new reference year. In this updating process, any
errors detected in the base map are also corrected. If important
changes have been made in the CLC production method, the
base map is also updated according to the new method.


2 https://land.copernicus.eu/pan-european/corine-land-cover.

In addition to the maps for each reference year, CLC
produces change layers for each period between reference
years: 1990-2000, 2000-2006, 2006-2012, 2012-2018. The
maps showing changes do not follow the same mapping
rules as the base CLC maps and show more detail than
the base layers for the reference years (MMU of 5 ha rather than 25 ha).
The CLC production team therefore recommends that LUC
changes be studied using these change layers, rather than by
cross-tabulating and comparing base CLC maps.
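
To make clear what cross-tabulating two base maps involves, here is a minimal sketch that counts the pixels in every pair of classes found at two dates. The grids are hypothetical and the function name `cross_tabulate` is ours. Note that a comparison of this kind cannot distinguish real changes from technical ones, which is one reason why the change layers are preferred.

```python
from collections import Counter


def cross_tabulate(map_t0, map_t1):
    """Count pixels for every (class at t0, class at t1) pair."""
    table = Counter()
    for row_t0, row_t1 in zip(map_t0, map_t1):
        for a, b in zip(row_t0, row_t1):
            table[(a, b)] += 1
    return table


# Toy 2 x 3 rasters with CLC codes (hypothetical values, not real data).
clc_2012 = [[211, 211, 311], [211, 112, 311]]
clc_2018 = [[211, 112, 311], [211, 112, 324]]

table = cross_tabulate(clc_2012, clc_2018)

# Off-diagonal pairs are the apparent changes between the two dates.
changes = {pair: n for pair, n in table.items() if pair[0] != pair[1]}
print(changes)
```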


Product description 


CLC is made up of two spatial layers: a Land Use Cover
map for each reference year (1990, 2000, 2006, 2012, 2018)
and a layer of Land Use Cover changes for each analysis
period (1990-2000, 2000-2006, 2006-2012, 2012-2018).
The reference map for each year provides Land Use Cover
information for the total area of the participant countries.
The map of changes only accounts for the changes that took
place in the period under consideration. Rather than comparing
two reference maps, the CLC layer of changes maps
all changes bigger than 5 ha and discards all technical
changes that did not take place on the ground.

CLC layers are provided in either vector (ESRI or 
GeoPackage databases) or raster (.tiff) formats. As might be 
expected, the vector data is much heavier than the raster 
data, because of its higher definition. 

Together with the LUC layers, the CLC product includes 
all the auxiliary information required to understand the LUC 
information provided by the CLC layers: a style layer for the 
raster, the legend description, technical information and 
other relevant metadata. LUC maps for the French overseas
departments (Guadeloupe, French Guiana, Martinique,
Mayotte and Réunion) are also provided in auxiliary layers.


Downloads 


The base layers with LUC maps for each reference year 
(CLC) have the same structure and group of files, as do the 
change layers for each period of analysis (CHA). This is why 
we only describe the file structure once for each type of 
format. 


CLC 2018 (Geodatabase)/CHA 2012-2018 (Geodatabase) 


— Geodatabase files with CLC vector layers (DATA folder) 

— Folder with CLC vector data for French overseas departments 

— Layer style files for ArcGIS (.lyr), QGIS (.qml) and any other GIS 
software (.sld) (Legend folder) 


— Excel presenting the CLC legend, including information about the 
RGB colours for each class (Legend folder) 

— Text documents describing the CLC legend, including information 
about the RGB colours for each class (Legend folder) 

— Folder with metadata files (.xml) 

— PDF and Excel sheet with information about CLC country coverage 
(Documents folder) 

— A Word document explaining how to use the CLC files for the 
product in QGIS (Documents folder) 

— Three text documents with technical information about the CLC 
layers (Documents folder) 


CLC 2018 (GeoPackage)/CHA 2012-2018 (GeoPackage) 


— GeoPackage file with CLC vector layers (DATA folder) 

— Layer style files for ArcGIS (.lyr), QGIS (.qml) and any other GIS 
software (.sld) (Legend folder) 

— Excel presenting the CLC legend, including information about the 
RGB colours for each class (Legend folder) 

— Text documents describing the CLC legend, including information 
about the RGB colours for each class (Legend folder) 

— Folder with metadata files (.xml) 

— PDF and Excel sheet with information about CLC country coverage 
(Documents folder) 


— A Word document explaining how to use the CLC files for the
product in QGIS (Documents folder)

— PDFs and text documents with technical information about the CLC
layers (Documents folder)


CLC 2018 (Raster)/CHA 2012-2018 (Raster)


— Raster file with CLC map (DATA folder)

— Folder with CLC raster data for French Overseas Departments
(DATA folder)

— Layer style files for ArcGIS (.lyr) and QGIS (.qml) (Legend folder)

— Layer style files for ArcGIS (.lyr) and QGIS (.qml) for French
Overseas Departments (French DOMs folder)

— Text document describing the CLC legend, including information
about the RGB colours for each class (Legend folder)

— Folder with metadata files (.xml)

— PDF and Excel sheet with information about CLC country coverage
(Documents folder)

— A Word document explaining how to use the CLC files for the
product in QGIS (Documents folder)

— PDFs and text documents with technical information about the CLC
layers (Documents folder)


Database


CLC 2018


OBJECTID | Code 18 | Remark | Area Ha | ID | Shape Length | Shape Area | C18
1 | 111 | | 130,86365376999143 | EU 1 | 10902,412470996784 | 1308636,5376999143 | 111
2 | 111 | | 53,7445236800477 | EU 2 | 6329,456389055609 | 537445,236800477 | 111
3 | 111 | | 30,719103909971782 | EU 3 | 3371,7747186580314 | 307191,0390997178 | 111


— OBJECTID: Unique identifier for each polygon

— Code 18: LUC code for the year 2018

— Remark

— Area Ha: Area of the polygon, in hectares

— ID: Unique identifier for each polygon

— Shape Length: Perimeter of the polygon, in metres

— Shape Area: Area of the polygon, in square metres

— C18: LUC code for the year 2018


CHA 2012-2018


OBJECTID | Change | ID | Code 12 | Code 18 | Chtype | Remark | Area Ha | Shape Length | Shape Area
1 | 111-112 | EU-1 | 111 | 112 | R | | 8,049594709980532 | 1774,991620734559 | 80495,94709980533
2 | 111-112 | EU-2 | 111 | 112 | R | | 18,185012579975673 | 2282,695093131555 | 181850,12579975673
3 | 111-112 | EU-3 | 111 | 112 | R | | 12,21255290507005 | 2012,3640670179973 | 122125,5290507005


— OBJECTID: Unique identifier for each polygon

— Change: Change code made up of the CLC code for the oldest year (2012) on the left and the CLC code for the most recent year (2018) on the right

— ID:

— Code 12: LUC code for the year 2012

— Code 18: LUC code for the year 2018

— Chtype

— Remark

— Area Ha: Area of the polygon, in hectares

— Shape Length: Perimeter of the polygon, in metres

— Shape Area: Area of the polygon, in square metres
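
The fields of the change layer can be checked with a few lines of code. The following minimal sketch uses the three sample records printed in the CHA 2012-2018 table, in which the first part of the change code matches Code 12 and the second part matches Code 18. It parses the change code, converts the decimal-comma numbers and verifies two things: that the two area fields agree (1 ha = 10,000 square metres) and that each polygon reaches the 5 ha minimum mapping unit for changes. The helper names are ours, and reading the attribute table from the geodatabase is left to the user's GIS library of choice.

```python
MMU_CHANGE_HA = 5.0  # minimum mapping unit for changes stated in the text


def parse_decimal_comma(text):
    """Convert a number written with a decimal comma (e.g. '8,05') to float."""
    return float(text.replace(",", "."))


def parse_change(change):
    """Split a change code such as '111-112' into (code_2012, code_2018)."""
    old, new = change.split("-")
    return int(old), int(new)


# The three sample records printed in the CHA 2012-2018 table.
records = [
    {"Change": "111-112", "Area Ha": "8,049594709980532",
     "Shape Area": "80495,94709980533"},
    {"Change": "111-112", "Area Ha": "18,185012579975673",
     "Shape Area": "181850,12579975673"},
    {"Change": "111-112", "Area Ha": "12,21255290507005",
     "Shape Area": "122125,5290507005"},
]

for rec in records:
    code_12, code_18 = parse_change(rec["Change"])
    area_ha = parse_decimal_comma(rec["Area Ha"])
    area_m2 = parse_decimal_comma(rec["Shape Area"])
    # Both area fields describe the same polygon, so they should agree.
    consistent = abs(area_ha * 10_000 - area_m2) < 1e-3
    above_mmu = area_ha >= MMU_CHANGE_HA
    print(code_12, code_18, round(area_ha, 2), above_mmu, consistent)
```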


Legend and codification 


Code | Label
111 | Continuous urban fabric
112 | Discontinuous urban fabric
121 | Industrial or commercial units
122 | Road and rail networks and associated land
123 | Port areas
124 | Airports
131 | Mineral extraction sites
132 | Dump sites
133 | Construction sites
141 | Green urban areas
142 | Sport and leisure facilities
211 | Non-irrigated arable land
212 | Permanently irrigated land
213 | Rice fields
221 | Vineyards
222 | Fruit trees and berry plantations
223 | Olive groves
231 | Pastures
241 | Annual crops associated with permanent crops
242 | Complex cultivation patterns
243 | Land principally occupied by agriculture, with significant areas of natural vegetation
244 | Agro-forestry areas
311 | Broad-leaved forest
312 | Coniferous forest
313 | Mixed forest
321 | Natural grasslands
322 | Moors and heathland
323 | Sclerophyllous vegetation
324 | Transitional woodland-shrub
331 | Beaches, dunes, sands
332 | Bare rocks
333 | Sparsely vegetated areas
334 | Burnt areas
335 | Glaciers and perpetual snow
411 | Inland marshes
412 | Peat bogs
421 | Salt marshes
422 | Salines
423 | Intertidal flats
511 | Water courses
512 | Water bodies
521 | Coastal lagoons
522 | Estuaries
523 | Sea and ocean
990 | UNCLASSIFIED LAND SURFACE
995 | UNCLASSIFIED WATER BODIES
999 | NO DATA
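
The three-digit CLC codes are hierarchical: the first digit identifies the most general group of classes. The following minimal sketch aggregates a code from the legend above to that first level. The level-1 labels are taken from the standard CLC nomenclature rather than from the table above, so they should be treated as an assumption here, and the function name `level1` is ours.

```python
# Level-1 groups of the CLC nomenclature (labels assumed from the
# standard CLC nomenclature, not listed in the legend table).
CLC_LEVEL1 = {
    1: "Artificial surfaces",
    2: "Agricultural areas",
    3: "Forest and semi-natural areas",
    4: "Wetlands",
    5: "Water bodies",
}

# Special codes listed at the end of the legend table.
SPECIAL_CODES = {
    990: "UNCLASSIFIED LAND SURFACE",
    995: "UNCLASSIFIED WATER BODIES",
    999: "NO DATA",
}


def level1(code):
    """Return the level-1 label for a three-digit CLC code."""
    if code in SPECIAL_CODES:
        return SPECIAL_CODES[code]
    # Integer division by 100 keeps the first digit of the code.
    return CLC_LEVEL1[code // 100]


for code in (111, 243, 324, 412, 523, 999):
    print(code, level1(code))
```

Aggregating to the first level in this way reduces the thematic detail, but it can make maps produced by different countries or at different dates easier to compare.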


Practical considerations 


CLC was originally mapped in vector format. This format 
provides higher precision and detail and is therefore recom- 
mended when working at local and regional scales. At national 
and supranational scales, raster data can be more suitable, as
vector data is too heavy and may be difficult to handle on
desktop computers with insufficient processing power.

Users can download the vector CLC to rasterize the 
database to the spatial resolution they require. The 100 m 
offered is the reference resolution provided by the EEA, but 
it is not the only one at which the map could be used. 

Users should be aware that different mapping method- 
ologies were used in different countries, and in some coun- 
tries, at different times. This could result in significant 
differences in the way the landscape is mapped and con- 
ceptualised, which could introduce important sources of 
uncertainty in our studies and analyses. The same category 
could be interpreted differently in different countries, and 
even within the same country, a particular category could be 
mapped differently at different times if the production 
method changes. Those wishing to analyse LUC change 
should therefore use the change layers rather than the maps. 
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3 PELCOM—Pan-European Land Use and Land Cover Monitoring 


Product 
LULC general 
Dates 

1997 


Formats 
Raster 


Pixel size 

1 km

Thematic resolution 
16 classes: 

1 (a), 3 (ag), 5 (v), 1 (m), 2 (na) 
Compatible legends 
No 

Extent 

Europe 

Updating 

No 


Change detection 


No (only one date) 
Overall accuracy 
Expected to be >69% 

Website of reference Website Language English 

http://www.geo-informatie.nl/projects/pelcom/ 

Download site 

http://www.geo-informatie.nl/projects/pelcom/public/index.htm 

Availability | Format(s)

Open Access | .tiff

Technical documentation 

Champeaux et al. (2000), Mücher (2000), Mücher et al. (2000) 


Other references of interest


Project


PELCOM (Pan-European Land Cover Monitoring) was a 
research project funded by the European Union that ran from 
1996 to 1999. The main purpose of the project was to 
develop a consistent methodology to create a continental 
LUC map for Europe from remote sensing sources. Users
were consulted about their needs and requirements. They
indicated that they would like to have LUC data at both coarser
and finer spatial resolutions than CLC, and that CLC should
be updated more frequently. They also made clear that a
dataset of this kind would be useful for environmental 
modelling and monitoring purposes. 

At the time the project was launched, no consistent 
continental LUC maps were available at high spatial reso- 
lution (at least 1 km). The map created through the project 
sought to provide a high-resolution continental LUC dataset 
that could later be updated frequently. However, despite 
these original intentions, the PELCOM map has not been 
updated since the project came to an end. 


Production method 


The classification carried out for the PELCOM map was 
based on AVHRR imagery and NDVI composites from the 
DLR archive of the JRC. An improved stratified, integrated 
classification methodology was specifically developed by the 
creators of this map. To this end, Europe was divided into 
different strata according to similarities in LULC patterns 
and phenology. 

The classification process consisted of several steps, in
which users played an important role. Both supervised and
unsupervised classifiers were employed. Some classes (forest,
water bodies, urban areas) were mapped through specific
workflows, using masks and other strategies, to reduce the
uncertainty and errors associated with their classification.


Product description 


PELCOM may be downloaded in three different formats: 
ESRI-grid, ERDAS-Image and ENVI. The download 
includes the raster with the LUC map and, depending on the 
format chosen, auxiliary information about the product 
(readme and symbology files). 

Detailed technical documentation about the map and its 
production method is also available from the download site. 


Downloads 


PELCOM ESRI-grid 


— Raster file with LUC map 

— Preview image of the product 

— Readme file with information about the product (.doc) 
— File with raster symbology for ArcGIS (.avl) 


Legend and codification 


Code  Label                   Code  Label
11    Coniferous forest       60    Barren land
12    Deciduous forest        70    Permanent Ice & Snow
13    Mixed forest            80    Wetlands
20    Grassland               91    Inland waters
31    Rainfed arable land     92    Sea
32    Irrigated arable land   100   Urban areas
40    Permanent crops         110   Data gaps
50    Shrubland               111   Out of scope
4 Annual Land Cover Product


Website of reference 


https://medium.com/swlh/europe-from-above-space-time-machine-learning-reveals-our-changing-environment-1b05cb7be520 


Download site 
https://maps.opendatascience.eu/ 
Availability 

Open Access 

Technical documentation 

Not published yet 


Other references of interest 


Product 
LULC general 
Dates 
2000-2019 
Formats 
Raster 

Pixel size 


30m 


Thematic resolution 


33 classes: 
8 (a), 7 (ag), 7 (v), 1 (m), 0 (na) 


Compatible legends 
LUCAS, CLC 
Extent 

Europe 

Updating 

Not planned 

Change detection 
Not recommended 
Overall accuracy 
Evaluation in process 


Website Language English 


Format(s) 
tiff 
Project 


An open annual land cover dataset for Europe has been
produced in the context of the “Geo-harmonizer: EU-wide
automated mapping system for harmonization of Open Data
based on FOSS4G and Machine Learning” project, coordinated
by the Czech Technical University in Prague. This
project is part of the Connecting Europe Facility (CEF) in 
Telecom, which aims to deploy digital service infrastructures 
(DSIs) that can facilitate cross-border interaction between 
public administrations, businesses and citizens. 

The Geo-harmonizer project has developed a web-based 
system (Open Data Science Europe) that hosts open European 
thematic geospatial layers, including one on land cover. They 
were specifically created for the project from other data 
sources for the period 2000-2020 using modelling techniques. 
These harmonized European layers overcome the limitations 
resulting from the use of national datasets that were created 
with different parameters and have different characteristics. 

Apart from a layer on land cover, Open Data Science
Europe hosts data on subjects such as the environment,
terrain, climate, soils or vegetation. These data are complementary
to the datasets provided by the Copernicus Land
Monitoring Service, also at continental level.

The project has the same values and approach as other 
Open Science projects in the geospatial field, such as Open 
Land Map and Open Street Map. 


Production method 


Open Data Science Europe's Annual Land Cover Product is 
obtained by producing a series of probability layers for each 
of the 33 LUC categories that were mapped. The land cover 
with the highest probability for each year and pixel 
according to these layers was the one finally selected to 
create the general LUC maps. 
Probability layers were obtained through a set of three
Machine Learning (ML) models: Random Forest, XGBoost and
Artificial Neural Network. The models were trained with reference
data obtained from CLC and LUCAS, and with the following inputs:
Landsat imagery (LANDSAT ARD), night lights data (VIIRS/SUOMI
NPP), Global surface water frequency and an EU DTM.

A final probability layer for each LUC category was
obtained after running a logistic regression classifier on the
results of the three ML models. The uncertainty of the probability
layers for each LUC category was also calculated as
the standard deviation of the three predicted probabilities
from the ML models.
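
The per-pixel logic described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the producers' implementation: the probabilities are invented, the function names are hypothetical, the ensemble mean stands in for the logistic regression step, and the population standard deviation is assumed because the text does not say which estimator was used.

```python
from statistics import mean, pstdev

# Hypothetical probabilities for one pixel and three LUC classes, one entry
# per model. The values are invented for illustration only.
model_probs = {
    "random_forest": {311: 0.70, 312: 0.20, 231: 0.10},
    "xgboost":       {311: 0.60, 312: 0.30, 231: 0.10},
    "neural_net":    {311: 0.65, 312: 0.10, 231: 0.25},
}


def combine(model_probs):
    """Return {class_code: (probability, uncertainty)} for one pixel.

    The mean is an assumed stand-in for the logistic regression step;
    the uncertainty is the standard deviation across the three models.
    """
    classes = next(iter(model_probs.values())).keys()
    combined = {}
    for code in classes:
        values = [probs[code] for probs in model_probs.values()]
        combined[code] = (mean(values), pstdev(values))
    return combined


def dominant_class(combined):
    """Pick the class with the highest combined probability."""
    return max(combined, key=lambda code: combined[code][0])


pixel = combine(model_probs)
best = dominant_class(pixel)
```

Repeating this selection for every pixel and year would yield the annual maps, while the second value of each pair would populate the corresponding uncertainty layer.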


Product description 


The dataset can be individually downloaded for each avail- 
able year of the period 2000-2019 from the Open Data 
Science Europe viewer. The download contains the raster 
file with the LUC information, but offers no other auxiliary 
data. Nonetheless, a layer style file to symbolize the dataset 
in QGIS can be downloaded separately (see footnote 3).


Downloads 


Annual Land Cover Product 2019 
— Raster file with LUC map (.tiff) 


Legend and codification 


Code  Label                                        Code  Label
111   Urban fabric                                 321   Natural grasslands
122   Road and rail networks and associated land   322   Moors and heathland
(continued)


3 http://s3.eu-central-1.wasabisys.com/eumap/lcv/lcv_landcover.hcl_lucas.corine.rf_p_30m_0..0cm_2000_eumap_epsg3035_v0.1.qml.
Code  Label                               Code  Label
123   Port areas                          323   Sclerophyllous vegetation
124   Airports                            324   Transitional woodland-shrub
131   Mineral extraction sites            331   Beaches, dunes, sands
132   Dump sites                          332   Bare rocks
133   Construction sites                  333   Sparsely vegetated areas
141   Green urban areas                   334   Burnt areas
211   Non-irrigated arable land           335   Glaciers and perpetual snow
212   Permanently irrigated arable land   411   Inland wetlands
213   Rice fields                         421   Maritime wetlands
221   Vineyards                           511   Water courses
222   Fruit trees and berry plantations   512   Water bodies
223   Olive groves                        521   Coastal lagoons
231   Pastures                            522   Estuaries
311   Broad-leaved forest                 523   Sea and ocean
312   Coniferous forest
Practical considerations


The dataset is currently only available for download at the 
Opendatascience website. However, it will soon be uploaded 
to public repositories, where users will be able to access all 
data from the project, including layers of uncertainty. 
Information about the dataset production procedure will also 
be published in the coming months together with other rel- 
evant information. 

The dataset can also be accessed through a WFS (footnote 4) and a
file service (Cloud-Optimized GeoTIFFs, footnote 5) in QGIS or other
common GIS software. The map producers also provide
information about how to access the data through GDAL, R
and Python. This can be found by clicking on the About tab
in the Opendatascience website.


4 https://geoserver.opendatascience.eu/geoserver/wfs.
5 http://s3.eu-central-1.wasabisys.com/eumap/lcv/lcv_landcover.hcl_lucas.corine.rf_p_30m_0..0cm_2019_eumap_epsg3035_v0.1.tif.
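
As a hedged illustration of the file service route, GDAL can read remote Cloud-Optimized GeoTIFFs when the URL is prefixed with its `/vsicurl/` virtual file system handler. The helper below only builds such a path. Its name and the example URL are hypothetical placeholders, and the producers' own instructions on the About tab remain the reference.

```python
from urllib.parse import urlparse


def vsicurl_path(url):
    """Prefix an HTTP(S) URL with GDAL's /vsicurl/ handler.

    The function only validates and builds the string; it does not open
    the file or check that the remote resource exists.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an HTTP(S) URL: {url!r}")
    return "/vsicurl/" + url


# Placeholder address: substitute the URL of the layer you need.
example_url = "https://example.org/landcover_2019.tif"
path = vsicurl_path(example_url)
```

The resulting string can be handed to GDAL-based tools (for instance `gdalinfo`) or libraries, which then request only the byte ranges they need instead of downloading the whole raster.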
5 GlobCorine


Website of reference 


http://due.esrin.esa.int/page_project114.php 


Download site 
https://maps.elie.ucl.ac.be/geoportail/ 
Availability 

Open Access (only 2005 map) 


Technical documentation 


Bontemps et al. (2010), Defourny et al. (2010a, b, c)


Other references of interest 
Bontemps et al. (2009) 


Product 

LULC general 
Dates 

2005, 2009 
Formats 

Raster 

Pixel size 

300 m 

Thematic resolution 


17 classes: 
1 (a), 3 (ag), 7 (v), 4 (m), 1 (na) 


Compatible legends 

CLC, FAO LCCS

Extent 

Continental (Europe and surroundings) 
Updating 

No 

Change detection 

Not recommended 


Overall accuracy 
>48% or >79% depending on the validation dataset considered


Website Language English 


Format(s) 
tiff 


General Land Use Cover Datasets for Europe 
Project 


Based on earlier efforts in GlobCover, the ESA launched the 
GlobCorine project in collaboration with the European 
Environment Agency (EEA) and the Université Catholique 
de Louvain (UCL). The aim was to create a new LUC 
product for the European continent that was compatible with 
the Corine Land Cover (CLC) classification and built on the 
work already carried out as part of the GlobCover project. 


Production method 


GlobCorine was produced by classifying the same MERIS 
imagery used for GlobCover. The same production method 
was used in the two LUC maps available and was similar to 
the one already used for GlobCover. It consisted of a series 
of supervised and unsupervised classification routines to 
identify spectro-temporal classes. These were later auto- 
matically labelled with the information provided by auxiliary 
datasets, mainly Corine Land Cover (CLC) and GlobCover. 
For classification purposes, the world was divided into dif- 
ferent regions according to their ecological and reflectance 
characteristics. 

An extra classification was carried out for mixed cate- 
gories. The final LUC maps were then corrected and 
improved in a post-classification phase with the help of 
auxiliary data and expert knowledge. 


Product description 


Only one of the two GlobCorine maps is currently available 
for download: the map for the reference year 2005. The 
download includes the raster with the LUC map, the legend, 
a file to symbolize it in GIS software and all relevant tech- 
nical information explaining the characteristics of the 
dataset. 


Downloads 


GlobCorine 2005 


— Raster file with LUC map (GLOBCORINE LC) 

— Preview image of the product (GLOBCORINE LC) 

— Layer style files for ArcGIS (.lyr) and ENVI (.dsr) 
(GLOBCORINE LC) 


— Excel sheet with the map legend ("GlobCorine legend")
(GLOBCORINE LC)

— PDFs with technical information about the product (Documentation) 

— PDF with a description of the downloaded product (README) 


Legend and codification 


Code  Label                                                 Code  Label
10    Urban areas and associated areas                      100   Complex cropland
20    Rainfed cropland                                      110   Mosaic cropland (50-80%) / natural vegetation (20-50%)
30    Irrigated cropland                                    120   Mosaic natural vegetation (50-80%) / cropland (20-50%)
40    Forest                                                130   Mosaic herbaceous (50-80%) / shrub-trees (20-50%)
50    Shrubland                                             140   Mosaic shrub-trees (50-80%) / herbaceous (20-50%)
60    Grassland                                             200   Water bodies
70    Sparsely vegetated areas                              210   Permanent snow and ice
80    Vegetated low-lying areas on regularly flooded soil   220   No Data
90    Bare areas


Practical considerations 


The product is no longer available for download from the 
official website of the EEA. The only edition that can still be 
obtained is the map for 2005, which is available through the 
geoportal of the Université Catholique de Louvain, one of 
the producers of the dataset. The map can also be consulted 
online at the same website, without having to download it. 

While the GlobCorine classification legend focuses par- 
ticularly on land use, GlobCover centres on land cover. 
GlobCorine can therefore be regarded as a complementary 
dataset to GlobCover. 


D. García-Álvarez et al.


6 Urban Atlas


Website of reference
https://land.copernicus.eu/local/urban-atlas

Download site
https://land.copernicus.eu/local/urban-atlas

Availability
Open Access under registration

Technical documentation
Copernicus Programme (2020), Gallaun (2017), Hirschmugl et al. (2018), Silva et al. (2013, 2016)

Other references of interest
Barranco et al. (2014), European Commission and OECD (2012), Jaffrain et al. (2016), Montero et al. (2014), Petrisor and Petrisor (2015),
Prastacos et al. (2011), Seifert (2009)

Product
LULC general

Dates
2006, 2012, 2018

Formats
Vector

Scale
Photointerpretation scale: 1:10,000
Minimum Mapping Unit: 0.25 ha in urban areas and 1 ha in rural areas
0.1 ha for urban changes and 0.25 ha for rural/natural changes
Minimum Mapping Width: 10 m

Thematic resolution
29 classes:
17 (a), 4 (ag), 2 (v), 2 (m), 2 (na)

Compatible legends
CLC

Extent
Europe (39 countries)

Updating
Every 6 years

Change detection
Through map of changes

Overall accuracy
Expected to be >80%

Website Language English, German and French

Format(s)
.gpkg




Project 


Urban Atlas is part of the Copernicus programme and pro- 
vides very detailed LUC information for Functional Urban 
Areas (FUA) in Europe. A Functional Urban Area (as 
defined by the European Commission and the OECD) is an 
urban space that joins the core areas of cities with their 
surrounding commuter belts. 

The Urban Atlas aims to contribute to the study of urban 
areas and their dynamics, in line with the needs of the 
European Commission and other European initiatives, such 
as ESPON and INTERREG. It therefore has a clear goal to 
inform policy-making. 

Three editions of the Urban Atlas have so far been pub- 
lished, with more FUAs participating in each one. 319 FUAs 
were mapped for reference year 2006, 785 for 2012 and 788 
for 2018. New updates of the Urban Atlas are expected 
every 6 years. 

For each edition, a detailed LUC map of the FUAs is 
provided, as well as a map of the changes that have taken 
place over the period under consideration (e.g. 2006-2012). 
A Street Tree Layer map is also provided for the 2012 and 
2018 editions. The 2012 Urban Atlas includes a building 
height map for core areas (not FUA) of European capitals in 
the EEA39. Polygons of the 2012 Urban Atlas also include 
population estimates. 


Production method 


Urban Atlas is obtained through automatic classification and 
manual photointerpretation of high-resolution satellite ima- 
gery: the optical VHR coverage of the Copernicus pro- 
gramme, at a spatial resolution of 2-4 m. 

First, the imagery is automatically segmented and classi- 
fied, differentiating between basic land cover classes. Later, 
the detailed interpretation of land cover classes is carried out 
visually. A range of auxiliary data are applied in this process: 
topographic maps, the High Resolution Layer for impervious 
surfaces, road networks from COTS (Commercial 
Off-The-Shelf) navigation data and OSM as well as other 
data sources depending on the class under consideration (e.g. 
Google Earth, local city maps, cadastral data or very high 
resolution imagery, at a spatial resolution of up to 1 m).

Change detection for each period is carried out indepen- 
dently, based on the Urban Atlas map for the previous year and 
a combination of both automatic and manual approaches for 
change detection. In the change detection process, misclassi- 
fications for the previous year of reference are corrected. 
There are certain exceptions to the Minimum Mapping
Units and Minimum Mapping Widths, depending on the
characteristics and pattern of the class being analysed.
However, no features are mapped below the 0.5 ha
threshold.


Product description 


Urban Atlas is distributed in single files for each FUA. There 
is no single common file that hosts all the FUAs together. 
A different file must be downloaded for each year and for 
each available change layer. 

Downloads include the vector layers in Geopackage 
format with the LUC information, the boundaries of the 
FUAs and their urban cores, a metadata file and layer style 
files to symbolize the vector layers in GIS. 

For reference years 2012 and 2018, the Street Tree Layer 
can be downloaded for each FUA. This layer represents 
contiguous rows or patches of trees covering at least 0.5ha. 
For the reference year 2012, a building height model in 
raster format can also be downloaded. 


Downloads 


Urban Atlas 2018 (Madrid) 


— GeoPackage file with Urban Atlas vector layers: Urban Atlas 2018, 
Urban Core and Boundary (DATA folder) 

— Layer style files for ArcGIS (.lyr), QGIS (.qml) and any other GIS 
software (.sld) (Legend folder) 

— Metadata file (.xml) (Metadata folder) 


Urban Atlas changes 2012-2018 (Madrid) 


— GeoPackage file with Urban Atlas Change vector layers: Urban 
Atlas Change 2012-2018, Urban Core and Boundary (DATA 
folder) 

— Layer style files for ArcGIS (.lyr), QGIS (.qml) and any other GIS 
software (.sld) (Legend folder) 

— Metadata file (.xml) (Metadata folder) 


Street tree layer 2012 — STL (Madrid) 


— Vector file with STL layer 
— Vector file with FUA boundary 


Building height 2012 (Madrid) 


— Raster file with building heights (DATA folder) 
— PDFs with technical information about the product (DOC folder) 
— Metadata file (.xml) (Metadata folder) 
Database


Urban area 2018 (Madrid) 


   fid  country  fua_name  fua_code  code_2018  class_2018        prod_date  identifier     perimeter         area              comment
1  1    ES       Madrid    ES001L3   11210      Discontinuous...  2020-02    20215-ES001L3  288,108889117289  4195,51178273653
2  2    ES       Madrid    ES001L3   11220      Discontinuous...  2020-02    34093-ES001L3  338,42589076045   7376,70547635699
3  3    ES       Madrid    ES001L3   11240      Discontinuous...  2020-02    43827-ES001L3  266,701707505638  4077,81816957153


— FID: Unique identifier for each polygon 

— Country: Country code 

— FUA name: Name of the Functional Urban Area 
— FUA code: Code for the Functional Urban Area 
— Code 2018: LUC code for the year 2018 

— Class 2018: LUC description for the year 2018 
— Prod date: Map production year 

— Identifier: Unique identifier for each polygon 

— Perimeter: Perimeter of the polygon, in metres 

— Area: Area of the polygon, in square metres 

— Comment: Extra field for additional comments about the mapped features 


Urban area change 2012-2018 (Madrid) 


   fid   country  fua_name  fua_code  code_2018  class_2018          prod_date  identifier     perimeter         area              comment  code_2012  class_2012
1  3334  ES       Madrid    ES001L3   14100      Green urban ar...   2020-02    84414-ES001L3  509,613513775099  16445,1738405536           23000      Pastures
2  3333  ES       Madrid    ES001L3   13400      Land without c...   2020-02    84413-ES001L3  545,957445830923  11287,5073455729           23000      Pastures
3  3332  ES       Madrid    ES001L3   11230      Discontinuous l...  2020-02    64723-ES001L3  216,313014959852  2768,58596393346           13300      Construction si...


— FID: Unique identifier for each polygon 

— Country: A two-letter code to identify each country 
— FUA name: Name of the Functional Urban Area 

— FUA code: Code for the Functional Urban Area 

— Code 2018: LUC code for the year 2018 

— Class 2018: LUC description for the year 2018

— Prod date: Map production year 

— Identifier: Unique identifier for each polygon 

— Perimeter: Perimeter of the polygon, in metres 

— Area: Area of the polygon, in square metres 

— Comment: Extra field for additional comments about the mapped features 
— Code 2012: LUC code for the year 2012 

— Class 2012: LUC description for the year 2012 
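
The fields of the change layer lend themselves to a simple transition summary. The sketch below is illustrative only: it uses the three sample records shown for Madrid, assumes the field names are written with underscores (`code_2012`, `code_2018`, `area`), and relies on hypothetical helper names. Areas are converted from square metres to hectares.

```python
from collections import defaultdict

# Sample records transcribed from the Madrid 2012-2018 change layer.
# Areas use a decimal comma, as in the attribute table.
records = [
    {"code_2012": "23000", "code_2018": "14100", "area": "16445,1738405536"},
    {"code_2012": "23000", "code_2018": "13400", "area": "11287,5073455729"},
    {"code_2012": "13300", "code_2018": "11230", "area": "2768,58596393346"},
]


def to_float(text):
    """Parse a number written with a decimal comma."""
    return float(text.replace(",", "."))


def change_matrix(records):
    """Sum the changed area, in hectares, per (code_2012, code_2018) pair."""
    matrix = defaultdict(float)
    for rec in records:
        key = (rec["code_2012"], rec["code_2018"])
        matrix[key] += to_float(rec["area"]) / 10_000  # square metres to hectares
    return dict(matrix)


matrix = change_matrix(records)

# Total area that left class 23000 (Pastures) in these sample records.
lost_pasture_ha = sum(area for (old, _), area in matrix.items() if old == "23000")
```

With a full change layer, the same dictionary would give the area of every transition between 2012 and 2018 classes for the FUA.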


Street tree layer 2012—STL (Madrid) 


   COUNTRY  CITIES  FUA_OR_CIT  STL  Shape_Leng       Shape_Area
1  ES       Madrid  ES001L2     1    236,03710343800  2115,64219083000
2  ES       Madrid  ES001L2     1    135,50042612100  679,89521552300
3  ES       Madrid  ES001L2     1    94,47833948390   565,27038061200


— COUNTRY: A two-letter code for each different country
— CITIES: Name of the Functional Urban Area
— FUA_OR_CIT: Code of the Functional Urban Area
— STL: Street Tree Layer code
— Shape_Leng: Perimeter of the polygon, in metres
— Shape_Area: Area of the polygon, in square metres




Legend and codification 


Urban Atlas

Code   Label
11100  Continuous urban fabric (S.L. > 80%)
11210  Discontinuous dense urban fabric (S.L. 50-80%)
11220  Discontinuous medium-density urban fabric (S.L. 30-50%)
11230  Discontinuous low-density urban fabric (S.L. 10-30%)
11240  Discontinuous very low-density urban fabric (S.L. < 10%)
11300  Isolated structures
12100  Industrial, commercial, public, military and private units
12210  Fast transit roads and associated land
12220  Other roads and associated land
12230  Railways and associated land
12300  Port areas
12400  Airports
13100  Mineral extraction and dump sites
13300  Construction sites
13400  Land without current use
14100  Green urban areas
14200  Sports and leisure facilities
21000  Arable land (annual crops)
22000  Permanent crops
23000  Pastures
24000  Complex and mixed cultivation
25000  Orchards
31000  Forests
32000  Herbaceous vegetation associations
33000  Open spaces with little or no vegetation
40000  Wetlands
50000  Water
91000  No data (Clouds and shadows)
92000  No data (Missing imagery)


Street tree layer

Code  Land cover
1     Tree cover


Practical considerations


LUC change must be analysed using the change layer. 
Comparing Urban Atlases for different years of reference 
will highlight many technical changes that did not actually 
happen on the ground. 

The Urban Atlas product can be also consulted online at 
the download webpage. 
7 N2K—Natura 2000


Website of reference 
http://land.copernicus.eu/local/natura 
Download site 
http://land.copernicus.eu/local/natura 
Availability 

Open Access under registration 
Technical documentation 

Buck and Büscher (2018)


Other references of interest 


Product 

LULC general 
Dates 

2006, 2012, 2018 
Formats 

Vector 

Scale 


Photointerpretation scale: 1:5,000—1:10,000 
Minimum Mapping Unit: 0.5 ha 
Minimum Mapping Width: 10 m 


Thematic resolution 


48 classes: 


8 (a), 6 (ag), 13 (v), 9 (m), 0 (na) 
Compatible legends 




Urban Atlas, Riparian Zones, Coastal Zone product 


Extent 

Europe (29 countries) 
Updating 

Not planned 

Change detection 
Yes 

Overall accuracy 
Expected to be >80% 


Website Language English, German and French 


Format(s) 


.gdb, .gpkg 




Project 


N2K was developed as part of the Copernicus Land Moni- 
toring programme. It maps land uses and covers in the areas 
that form part of the Natura 2000 network, plus a 2 km 
buffer zone around their perimeters. Natura 2000 is a net- 
work that protects natural areas with rare and threatened 
species or with rare types of natural habitat. 

The dataset first appeared in 2015. A revised edition
was issued in 2017 with a new classification legend that
made it compatible with other European local reference LUC
datasets: Riparian Zones, Urban Atlas and the Coastal Zone
product.


Production method 


N2K is obtained by photointerpretation of high-resolution 
imagery. Various auxiliary datasets are used in the pho- 
tointerpretation process, namely CORINE Land Cover, 
Urban Atlas, High Resolution Layers, topographic maps, 
national WMS services and COTS navigation data. The 
changes are also photointerpreted by comparing satellite 
images at two different points in time. 


Database 


Product: N2K 2000


OBJECTID ID UID SITECODE GRASSTYPE MAES_1_12
1 1 4114 4114 1
2 2 4114 4114 2
3 3 4114 4114 3


— OBJECTID: Unique identifier for each polygon
— ID: Unique identifier for each polygon
— UID
— SITECODE
— GRASSTYPE
— MAES_1_12: MAES class Level 1 for 2012
— MAES_2_12: MAES class Level 2 for 2012
— MAES_3_12: MAES class Level 3 for 2012
— MAES_4_12: MAES class Level 4 for 2012
— COMMENT_12: Comments on the 2012 mapping
— NODATA_12: Objects with no data in 2012
— MAES_1_06: MAES class Level 1 for 2006
— MAES_2_06: MAES class Level 2 for 2006
— MAES_3_06: MAES class Level 3 for 2006
— MAES_4_06: MAES class Level 4 for 2006
— COMMENT_06: Comments on the 2006 mapping
— NODATA_06: Objects with no data in 2006
— CHANGECODE: 2006-2012 changes
— AREA_HA: Area of the polygon, in hectares


Product description


N2K is distributed as a single vector file covering all mapped
Natura 2000 areas. Two formats are available: ESRI Geodatabase
and Geopackage. Downloads include the layers
with LUC information, a style file to symbolize the layers in
GIS and a pdf with the product classification scheme.


Downloads


N2K 2012 (Geodatabase)


— Geodatabase files with N2K vector layers

— Layer style files for ArcGIS (.lyr), QGIS (.qml) and any other GIS
software (.sld) (Legend folder)

— Metadata file (.xml) (Metadata folder)

— PDF with nomenclature guidelines


N2K 2012 (GeoPackage)


— GeoPackage files with N2K vector layers

— Layer style files for ArcGIS (.lyr), QGIS (.qml) and any other GIS
software (.sld) (Legend folder)

— Metadata file (.xml) (Metadata folder)

— PDF with nomenclature guidelines
Legend and codification


N2K was produced according to a hierarchical classification 
legend made up of four different levels, the most detailed of 
which is provided here (MAES L3). Information about the 
other levels of classification and their codes can be found in 
the technical documents accompanying the dataset. 


Code  Label
111   Urban fabric (predominantly public and private units)
112   Industrial, commercial and military units
121   Road networks and associated land
122   Railways and associated land
123   Port areas and associated land
124   Airports and associated land
131   Mineral extraction, dump and construction sites
132   Land without current use
211   Arable land
212   Greenhouses
221   Vineyards, fruit trees and berry plantations
222   Olive groves
231   Annual crops associated with permanent crops
232   Complex cultivation patterns
233   Land principally occupied by agriculture with significant areas of natural vegetation
234   Agro-forestry
311   Natural and semi-natural broadleaved forest
312   Highly artificial broadleaved plantations
321   Natural and semi-natural coniferous forest
322   Highly artificial coniferous plantations
331   Natural and semi-natural mixed forest
332   Highly artificial mixed plantations
341   Transitional woodland and scrub
342   Lines of trees and scrub
4211  Semi-natural grassland with woody plants (C.C.D. > 30%)
4212  Semi-natural grassland without woody plants (C.C.D. < 30%)
422   Alpine and sub-alpine natural grassland
511   Heathland and Moorland
512   Other scrub land
621   Beaches and dunes
622   River banks
631   Bare rocks and rock debris
632   Burnt areas (except burnt forest)
633   Glaciers and perpetual snow
721   Exploited peat bog
722   Unexploited peat bog
811   Coastal salt marshes
812   Salines
813   Intertidal flats
821   Coastal lagoons
822   Estuaries
911   Interconnected water courses
912   Highly modified water courses and canals
913   Separated water bodies belonging to the river system
921   Natural water bodies
922   Artificial standing water bodies
923   Intensively managed fish ponds
924   Standing water bodies of extractive industrial sites


Practical considerations


N2K files are very large (over 2 GB), which means that they
may be difficult to use for those without powerful computers.
The map can also be consulted online in a viewer included in
the download website of the product.


8 Riparian Zones Land Cover/Land Use—Riparian Zones (RZ)


Website of reference
https://land.copernicus.eu/local/riparian-zones

Download site
https://land.copernicus.eu/local/riparian-zones/land-cover-land-use-lclu-image?tab=download

Availability
Open Access after registration

Technical documentation
Tamame et al. (2018), Vandeputte et al. (2018), Weissteiner et al. (2016)

Other references of interest
Piedelobo et al. (2019), Ugille (2019)

Product
LULC general

Dates
2012, 2018

Formats
Vector

Scale
Photointerpretation scale: 1:10,000
Minimum Mapping Unit: 0.5 ha
Minimum Mapping Width: 10 m

Thematic resolution
56 classes:
11 (a), 6 (ag), 16 (v), 10 (m), 0 (na)

Compatible legends
Urban Atlas, N2K, Coastal Zone product

Extent
Europe (39 countries)

Updating
Not expected

Change detection
No (only one date)

Overall accuracy
Expected to be >85%

Website Language English, French and German

Format(s)
.shp
Project


Riparian Zones (RZ) is one of the local datasets produced as 
part of the Copernicus Land Monitoring Programme. It 
focuses on riparian areas (i.e. transitional areas between land 
and freshwater ecosystems with very specific characteristics) 
associated with Strahler level 2-8 rivers. 

This product was created to support the Mapping and 
Assessment of Ecosystems and their Services (MAES) 
within the context of the EU Biodiversity Strategy for 2020. 
It is also intended for use in relation to the Habitats, Birds 
and Water Framework Directives. 

The Riparian Zones dataset was initially launched in 
2015, with an extension in 2017/18 to include riparian areas 
from Strahler 2 rivers. In 2017 the classification scheme was 
adapted to make it compatible with other local products 
developed under the Copernicus Land Monitoring 
framework. 

Together with the LUC map of riparian zones, two extra 
complementary products are also provided: a delineation of 
Riparian Zones based on a fuzzy modelling approach and an 
inventory of the Green Linear Elements (hedgerows and 
lines of trees) growing in those riparian areas. 


Production method


The RZ map was obtained through semi-automatic classification
of very high-resolution imagery captured by the
SPOT and Pleiades satellites (1.5-2.5 m). This classification
was later refined with the aid of visual interpretation and
intersected with the following auxiliary datasets: CORINE
Land Cover, Imperviousness HRL, Tree Cover Density HRL
and Urban Atlas.


Product description


A different vector file is provided for each riparian area
mapped. Downloads include the vector file with LUC
information and pdf documents with information about the
product.


Downloads


Riparian zones land cover land use 2012 (vector)


— Vector files with LUC information (Data folder)

— PDF with nomenclature guidelines (Documents folder)
— PDF with product specifications (Documents folder)

— Metadata files (.xml) (Metadata folder)


Database


Riparian zones land cover land use 2012—LCLU (vector)


ID DU_ID MAES_1 MAES_2 MAES_3
1 1 DU020A 1 11
2 2 DU020A 1 1
3 3 DU020A 1 11

MAES_4 UA AREA_HA NODATA COMMENT
111 1111 0,73018853235 0
111 1111 0,90258671155 0
111 1111 0,64638451960 0


— ID: Unique identifier for each polygon

— DU_ID: Mapped area Code

— MAES_1: MAES class Level 1

— MAES_2: MAES class Level 2

— MAES_3: MAES class Level 3

— MAES_4: MAES class Level 4

— UA

— AREA_HA: Area of the polygon, in hectares

— NODATA: Unclassifiable areas due to clouds, shadows, snow, haze or missing data

— COMMENT: Comment field for additional information
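
Because the MAES fields are hierarchical, a detailed code can be broken down into its parent classes. The sketch below assumes that each level extends the previous one by a single digit, as the codes in the legends of this chapter suggest (for example 1111, 1112 and 1113 under 111). The function name is hypothetical, and the treatment of code 10 as a single level 1 class is an assumption, since it does not fit the one-digit pattern.

```python
def maes_levels(code):
    """Return the chain of MAES codes from level 1 down to the given code.

    Assumes one extra digit per level; code 10 is treated as a level 1
    class on its own (an assumption, see the accompanying text).
    """
    text = str(code)
    if not text.isdigit():
        raise ValueError(f"not a numeric MAES code: {code!r}")
    if text == "10":
        return ["10"]
    return [text[:n] for n in range(1, len(text) + 1)]


levels_urban = maes_levels(1111)   # four levels
levels_green = maes_levels("14")   # class defined only down to level 2
```

A lookup of this kind can be used to aggregate a detailed map to a coarser level of the legend, for example by grouping polygons on the first element of the returned list.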




Legend and codification 


The Riparian Zones dataset was produced following a
hierarchical classification legend made up of three different
levels, the most detailed of which is provided here. Information
about the other levels of classification of LUC categories
can be found in the technical documents
accompanying the dataset.


Code  Label
1111  Continuous Urban Fabric (IM.D > 80%)
1112  Dense Urban Fabric (IM.D > 30-80%)
1113  Low Density Fabric (IM.D < 30%)
112   Industrial, commercial and military units
121   Road networks and associated land
122   Railways and associated land
123   Port areas and associated land
124   Airports and associated land
131   Mineral extraction, dump and construction sites
132   Land without current use
14    Green urban, sports and leisure facilities
211   Arable land
212   Greenhouses
221   Vineyards, fruit trees and berry plantations
222   Olive groves
231   Annual crops associated with permanent crops
232   Complex cultivation patterns
233   Land principally occupied by agriculture with significant areas of natural vegetation
234   Agro-forestry
311   Natural and semi-natural broadleaved forest
312   Highly artificial broadleaved plantations
321   Natural and semi-natural coniferous forest
322   Highly artificial coniferous plantations
331   Natural and semi-natural mixed forest
332   Highly artificial mixed plantations
341   Transitional woodland and scrub
342   Lines of trees and scrub
35    Damaged forest
41    Managed grassland
421   Semi-natural grassland
422   Alpine and sub-alpine natural grassland
511   Heathland and Moorland
512   Other scrub land
52    Sclerophyllous vegetation
61    Sparsely vegetated areas
621   Beaches and dunes
622   River banks
631   Bare rocks and rock debris
632   Burnt areas (except burnt forest)
633   Glaciers and perpetual snow
71    Inland marshes
721   Exploited peat bog
722   Unexploited peat bog

Practical considerations 


Code 
811 


812 


813 


821 


822 


911 


912 


913 


921 


922 


923 


924 


10 
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Label 


Coastal salt 
marshes 


Salines 


Intertidal flats 


Coastal lagoons 


Estuaries 


Interconnected 
water courses 


Highly modified 
water courses 
and canals 


Separated water 
bodies belonging 
to the river 
system 


Natural water 
bodies 


Artificial 
standing water 
bodies 


Intensively 
managed fish 
ponds 


Standing water 
bodies of 
extractive 
industrial sites 


Sea and ocean 


The map can also be consulted online in a viewer available 


on the download site (see link above). 




9 Coastal Zones 


Product 
LULC general 
Dates 

2012, 2018 
Formats 
Vector 

Scale 


1:10,000 
Minimum Mapping Unit: 0.5 ha 
Minimum Mapping Width: 10 m 


Thematic resolution 


71 classes: 


19 (a), 6 (ag), 17 (v), 9 (m), 0 (na) 
Compatible legends 

Urban Atlas, N2K, Riparian Zones 

Extent 

Coastlines of EEA member states (39 countries) 


Updating 
Yes 


Change detection 


Yes, through the change layer 
Overall accuracy 

Expected to be >85% 
Website of reference
https://land.copernicus.eu/local/coastal-zones

Website language
English, German and French

Download site
https://land.copernicus.eu/local/coastal-zones

Availability
Open Access after registration

Format(s)
.gdb, .gpkg

Technical documentation 

European Environment Agency (2021) 


Other references of interest 




Project 


The Coastal Zones Land Cover/Land Use dataset is pro- 
duced by the European Environment Agency (EEA) as part 
of the Copernicus Land Monitoring Service (CLMS). The 
dataset has been developed in collaboration with the 
Copernicus Marine Environment Monitoring Service 
(CMEMS) and representatives from the potential community 
of users. 

It is specifically intended for monitoring coastal areas and 
provides an important source of information for all EU 
policies dealing with coastal management and maritime 
spatial planning. 

The dataset maps, at very detailed scale, the land uses and 
covers in coastal areas in the 39 countries belonging to the 
EEA. The coastal area mapped is defined by a 10 km inland 
buffer zone and the Corine Land Cover (CLC) seawards 
buffer zone. Relevant estuaries, coastal lowlands and nature 
reserves that extend beyond the buffer zone have also been 
included. 

The dataset's classification legend has been specifically 
designed to fit the needs of its user community. It is based on 
the Mapping and Assessment of Ecosystems and their Ser- 
vices (MAES) ecosystem typology and makes the product 
compatible with other CLMS local monitoring datasets, such 
as Urban Atlas, Riparian Zones and N2K. 

The dataset is composed of two LUC maps for the ref- 
erence years 2012 and 2018, plus a change layer for the 
period 2012-2018. The dataset will be updated every 6 
years, in accordance with the CLC production timeline. 


Database 


Coastal zones 2018 (Geodatabase) 


Attribute table extract:

fid    CODE_1_18    CODE_2_18    CODE_3_18    CODE_4_18    CODE_5_18
1      1            11           111          1111         11110
2      1            11           111          1111         11110
3      1            11           111          1112         11120
4      1            11           111          1111         11110


- fid: Identifier for each polygon
- ID: Unique identifier for each polygon
- DU:
- CODE_1_18: LUC category for the Level 1 classification legend
- CODE_2_18: LUC category for the Level 2 classification legend
- CODE_3_18: LUC category for the Level 3 classification legend
- CODE_4_18: LUC category for the Level 4 classification legend
- CODE_5_18: LUC category for the Level 5 classification legend
- COMMENT_18: Comments on the mapping
- NODATA_18: Objects with no data in 2018
- AREA_HA: Area of the polygon, in hectares
- Shape_Length: Perimeter of the polygon, in metres
- Shape_Area: Area of the polygon, in square metres

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Production method 


The Coastal Zones Land Cover/Land Use dataset is produced
via computer-assisted photointerpretation of very high
spatial resolution (1.5-4 m) imagery from a wide variety of
missions: SPOT, Pléiades, WorldView, SuperView, KOMPSAT,
Planet Dove, Deimos and TripleSat. A variable
photointerpretation scale (1:5,000-1:10,000) was selected
depending on the mapped landscape and feature
characteristics. The following auxiliary datasets were also
used to support the photointerpretation process: CLC, Urban
Atlas, HRL, Bing Maps and different imagery sources
(DWH_MG2_CORE_01 coverage, Sentinel-2, Landsat-8, national
aerial imagery, Google Earth).


Product description 


Users can download the Coastal Zones dataset in two different 
formats: Geodatabase and GeoPackage. Different download 
files are available for each year of reference (2012, 2018) as 
well as for the change layer (2012-2018). All downloads 
include the same information: layers with LUC information, a 
style file for their symbolization in GIS and auxiliary data. 


Downloads 


Coastal zones 2018 (Geodatabase) 


— Geodatabase files with Coastal Zones vector layers 

— Layer style files for ArcGIS (.lyr), QGIS (.qml) and any other GIS 
software (.sld) (Symbology folder) 

— Metadata file (.xml, .gfs) (Metadata folder) 


NODATA_18    AREA_HA              Shape_Length         Shape_Area
0            0.907902455860...    448.5645642872...    9079.024558604...
0            2.709284033416...    664.5918333262...    27092.84033416...
0            1.804702394212...    565.373960242318     18047.02394212...
0            2.832704905509...    674.9293361490...    28327.0490550978


Coastal zones change 2012-2018 (Geodatabase)


Attribute table extract:

OBJECTID    CODE_3_12    CODE_4_12    CODE_5_12    CODE_1_18
1           111          1111         11110        1
2           111          1112         11120        1
3           111          1112         11120        1
4           111          1112         11120        4


- fid: Identifier for each polygon
- ID: Unique identifier for each polygon
- DU:
- CODE_1_12: LUC category for the Level 1 classification legend in 2012
- CODE_2_12: LUC category for the Level 2 classification legend in 2012
- CODE_3_12: LUC category for the Level 3 classification legend in 2012
- CODE_4_12: LUC category for the Level 4 classification legend in 2012
- CODE_5_12: LUC category for the Level 5 classification legend in 2012
- CODE_1_18: LUC category for the Level 1 classification legend in 2018
- CODE_2_18: LUC category for the Level 2 classification legend in 2018
- CODE_3_18: LUC category for the Level 3 classification legend in 2018
- CODE_4_18: LUC category for the Level 4 classification legend in 2018
- CODE_5_18: LUC category for the Level 5 classification legend in 2018
- COMMENT: Comments on the mapping
- NODATA_12: Objects with no data in 2012
- NODATA_18: Objects with no data in 2018
- AREA_HA: Area of the polygon, in hectares
- Shape_Length: Perimeter of the polygon, in metres
- Shape_Area: Area of the polygon, in square metres


Legend and codification


The Coastal Zones dataset was produced following a
hierarchical classification legend made up of five different
levels, the most detailed of which is provided here.
Information about the full classification scheme, including
the five different levels, can be found in the technical
documentation accompanying the dataset.


Code     Label
11110    Continuous urban fabric (IMD > 80%)
11120    Dense urban fabric (IMD > 30-80%)
11130    Low-density fabric (IMD < 30%)
11210    Industrial, commercial, public and military units (other)
11220    Nuclear energy plants and associated land
12100    Road networks and associated land
12200    Railways and associated land
12310    Cargo port
12320    Passenger port
12330    Fishing port
12340    Naval port
12350    Marinas
12360    Local multi-functional harbours
12370    Shipyards
12400    Airports and associated land
13110    Mineral extraction sites
13120    Dump sites
13130    Construction sites
13200    Land without current use
14000    Green urban, sports and leisure facilities
21100    Arable irrigated and non-irrigated land
21200    Greenhouses
22100    Vineyards, fruit trees and berry plantations
22200    Olive groves
23100    Annual crops associated with permanent crops
23200    Complex cultivation patterns
23300    Land principally occupied by agriculture with significant areas of natural vegetation
23400    Agro-forestry
31100    Natural & semi-natural broadleaved forest
31200    Highly artificial broadleaved plantations
32100    Natural & semi-natural coniferous forest
32200    Highly artificial coniferous plantations
33100    Natural & semi-natural mixed forest
33200    Highly artificial mixed plantations
34000    Transitional woodland and scrub
35000    Lines of trees and scrub
36000    Damaged forest
41000    Managed grassland
42100    Semi-natural grassland
42200    Alpine and sub-alpine natural grassland
51000    Heathland and moorland
52000    Alpine scrub land
53000    Sclerophyllous scrubs
61100    Sparse vegetation on sands
61200    Sparse vegetation on rocks
62111    Sandy beaches
62112    Shingle beaches
62120    Dunes
62200    River banks
63110    Bare rocks and outcrops
63120    Coastal cliffs
63200    Burnt areas (except burnt forest)
63300    Glaciers and perpetual snow
71100    Inland marshes
71210    Exploited peat bogs
71220    Unexploited peat bogs
72100    Salt marshes
72200    Salines
72300    Intertidal flats
81100    Natural & semi-natural water courses
81200    Highly modified water courses and canals
81300    Seasonally connected water courses (oxbows)
82100    Natural lakes
82200    Reservoirs
82300    Aquaculture ponds
82400    Standing water bodies of extractive industrial sites
83100    Lagoons
83200    Estuaries
83300    Marine inlets and fjords
84100    Open sea
84200    Coastal waters
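

The sample attribute rows suggest that each higher-level code is a prefix of the five-digit Level 5 code (for example, 11120 belongs to 1, 11, 111 and 1112). The following minimal Python sketch relies on that assumption, which should be checked against the technical documentation; the function names `code_levels` and `changed_at_level` are hypothetical and are not part of the dataset or of any library.

```python
def code_levels(code_5):
    """Derive the Level 1-5 codes from a five-digit Level 5 code.

    Assumption: each parent code is the leading digits of the Level 5 code.
    """
    code_5 = str(code_5)
    if len(code_5) != 5 or not code_5.isdigit():
        raise ValueError(f"expected a five-digit code, got {code_5!r}")
    return {f"CODE_{level}": code_5[:level] for level in range(1, 6)}


def changed_at_level(code_5_2012, code_5_2018, level):
    """Return True if a polygon changed category at the given legend level."""
    key = f"CODE_{level}"
    return code_levels(code_5_2012)[key] != code_levels(code_5_2018)[key]


if __name__ == "__main__":
    print(code_levels("11120"))
    # A change from dense to continuous urban fabric is only visible
    # from Level 4 downwards under the prefix assumption.
    print(changed_at_level("11120", "11110", level=3))
    print(changed_at_level("11120", "11110", level=4))
```

Comparing codes at a coarser level in this way can help to separate changes between broad categories from changes that only affect the most detailed classes.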
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Practical considerations 


Coastal Zones files are very large (over 3 GB), which
means that the dataset may be difficult to use on less
powerful computers. The dataset can also be consulted
online using the viewers available when downloading
the different layers.
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10 S2GLC 2017—Sentinel-2 Global Land Cover 2017 


Product
LULC general

Dates
2017

Formats
Raster

Pixel size
10 m

Thematic resolution
13 classes:
1 (a), 2 (ag), 5 (v), 0 (m), 0 (na)

Compatible legends
CLC

Extent
Europe, except Russia

Updating
Not expected

Change detection
No (only one date)

Overall accuracy
Expected to be >86%

Website of reference
http://s2glc.cbk.waw.pl/extension

Website language
English

Download sites
http://s2glc.cbk.waw.pl/extension
https://finder.creodias.eu/

Availability
Open Access

Format(s)
.tiff

Technical documentation
Gromny et al. (2019a, b), Kukawska et al. (2017), Malinowski et al. (2019), Nowakowski et al. (2017)

Other references of interest


Project 


Sentinel-2 Global Land Cover (S2GLC) was a project fun- 
ded by the European Space Agency (ESA) in order to create 
an automatic methodology to globally map LUC at high 
resolution from Sentinel-2 imagery. The project was led by 
the Space Research Centre of the Polish Academy of Sci- 
ences (CBK PAN). Its main output is the S2GLC 2017 map. 

The project was developed in two phases. In the first
phase, the proposed methodology was tested in five
prototype sites: Germany, Italy, China, Colombia and Namibia.
In the second phase, the methodology was adjusted to map
LUC for the whole of Europe, except Russia, Belarus and
Ukraine.


Production method 


S2GLC was obtained by classifying Sentinel-2 imagery. 
Each Sentinel-2 scene was individually classified using a set 
of multi-temporal images through a random forest classifier. 
Training data was automatically extracted from existing 
datasets, such as CORINE Land Cover. A set of probability 
rasters were obtained from the random forest classifier, and 
the class finally selected for each pixel was the one with the 
highest probability over the whole time series. 
A post-classification step was applied for those pixels with 
low probabilities. 
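

The selection rule described above can be illustrated with a short Python sketch. It is only one reading of the text, not the S2GLC implementation: it assumes that the probability rasters are stacked in an array of shape (dates, classes, rows, columns), and the function name `select_class` and the threshold `min_prob` are hypothetical.

```python
import numpy as np


def select_class(probs, class_codes, min_prob=0.5):
    """Pick, for each pixel, the class with the highest probability
    found anywhere in the time series.

    probs       : array of shape (n_dates, n_classes, rows, cols)
    class_codes : legend code of each class, in the order of axis 1
    min_prob    : hypothetical threshold below which a pixel is flagged
                  for a later post-classification step
    """
    probs = np.asarray(probs, dtype=float)
    best_per_class = probs.max(axis=0)        # highest value of each class over time
    winner = best_per_class.argmax(axis=0)    # index of the winning class per pixel
    labels = np.asarray(class_codes)[winner]
    low_confidence = best_per_class.max(axis=0) < min_prob
    return labels, low_confidence


if __name__ == "__main__":
    # Two dates, two classes (codes 62 and 73), one row of two pixels.
    probs = np.array([
        [[[0.6, 0.2]], [[0.4, 0.3]]],   # date 1: class 62, class 73
        [[[0.1, 0.4]], [[0.9, 0.3]]],   # date 2: class 62, class 73
    ])
    labels, low = select_class(probs, [62, 73])
    print(labels)  # legend codes per pixel
    print(low)     # True where the best probability is below min_prob
```

Pixels flagged in `low` would be the candidates for the post-classification step; how that step actually treats them is described in the technical documentation.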


Product description 


S2GLC 2017 can be downloaded as a single file or in tiles. 
In the first case, users can choose to download the raster 
LUC file, either symbolized (RGB GeoTiff file) or not 
(GeoTiff file). Users who opt to download a tile from the 
map will automatically download both types of rasters. 
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Downloads 


European land cover map (single-band file, RGB file) 


— Raster file with LUC map 

— TXT file with the product legend 

— PDF with technical information about the product (tiles 
decomposition) 


Single tile 


— Raster file with LUC map 

— Coloured raster file with LUC map 
— Preview image of the product 

— TXT file with the product legend 


Legend and codification 


Code    Label
62      Artificial surfaces and constructions
73      Cultivated areas
75      Vineyards
82      Broadleaf tree cover
83      Coniferous tree cover
102     Herbaceous vegetation
103     Moors and Heathland
104     Sclerophyllous vegetation
105     Marshes
106     Peatbogs
121     Natural material surfaces
123     Permanent snow-covered surfaces
162     Water bodies
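

As an illustration of how this legend can be used, the following Python sketch counts the pixels of each code in an array and converts them to hectares using the 10 m pixel size given above. It assumes that the raster has already been read into a NumPy array (for example with a GIS library); the function name `class_areas_ha` is hypothetical.

```python
import numpy as np

# Legend codes and labels as listed in the table above.
S2GLC_LEGEND = {
    62: "Artificial surfaces and constructions",
    73: "Cultivated areas",
    75: "Vineyards",
    82: "Broadleaf tree cover",
    83: "Coniferous tree cover",
    102: "Herbaceous vegetation",
    103: "Moors and Heathland",
    104: "Sclerophyllous vegetation",
    105: "Marshes",
    106: "Peatbogs",
    121: "Natural material surfaces",
    123: "Permanent snow-covered surfaces",
    162: "Water bodies",
}


def class_areas_ha(raster, pixel_size_m=10.0):
    """Return {code: (label, area in hectares)} for every code in the array."""
    codes, counts = np.unique(np.asarray(raster), return_counts=True)
    pixel_area_ha = pixel_size_m ** 2 / 10_000  # a 10 m pixel covers 0.01 ha
    return {
        int(code): (S2GLC_LEGEND.get(int(code), "Unknown code"),
                    float(count * pixel_area_ha))
        for code, count in zip(codes, counts)
    }


if __name__ == "__main__":
    tile = np.array([[62, 62], [162, 73]])
    for code, (label, area) in class_areas_ha(tile).items():
        print(code, label, area)
```

Codes that are not in the legend are kept and labelled "Unknown code", so that unexpected values in a tile are not silently dropped.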


Practical considerations 


The map can be consulted online using the CREODIAS
Browser application (https://browser.creodias.eu/). Single
file download options involve very large files (8-16.2 GB),
for which a powerful computer will be required.
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Abstract 


Several general Land Use Cover (LUC) datasets are
available for Africa. They provide a general picture of the
land uses and covers in more than one African country,
rather than focusing on any specific type. In this chapter,
we review six datasets of this kind. Only one (CCI
LAND COVER - S2 PROTOTYPE, 20 m) covers the
whole continent, while the others map specific
regions of Africa. All these datasets were produced
within the context of specific projects, usually sponsored
by international organizations such as the European Space
Agency (ESA), the Food and Agriculture Organization
(FAO) or the National Aeronautics and Space
Administration (NASA). Once these projects came to an end,
no further updates of the maps were published, which limits
the potential and the temporal resolution of the available
datasets. For Africa, only the West Africa Land Use Land
Cover (2 km) and the SERVIR-ESA (30 m) datasets provide a
time series of LUC maps. The first provides maps for
three reference years (1975, 2000, 2013), while in the
second the number of maps available and their respective
reference years vary from country to country: from 2 to 4
different editions issued between 1990 and 2015.
AFRICOVER (1:200,000) and the Congo Basin Vegetation
Types dataset (300 m) provide LUC information for just
one reference year, although they were created from
imagery covering a long time-span: 1994-2001 for
AFRICOVER and 2000-2007 for Congo Basin Vegetation
Types. The SADC Land Cover Database (1:250,000)
was obtained by merging and harmonizing national and
regional LUC datasets. As a result, the reference year
varies from one country to the next, always between 1990
and 1999. The CCI LAND COVER - S2 PROTOTYPE
was produced at the highest spatial resolution of all the
datasets reviewed in this chapter (20 m). It also provides
the most comprehensive and most up-to-date LUC picture of
Africa, with information for the year 2015/16.
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1 West Africa Land Use Land Cover 


Product 

LULC general 
Dates 

1975, 2000, 2013 
Formats 

Raster 

Pixel size 

2 km 

Thematic resolution 


30 classes:
2 (a), 5 (ag), 12 (v), 3 (m), 3 (na)¹

Compatible legends
None

Extent
West Africa and Cape Verde

Updating
Not expected

Change detection
Yes

Overall accuracy
Not specified

Website of reference
https://eros.usgs.gov/westafrica/

Website language
English

Download site
https://eros.usgs.gov/westafrica/data-downloads
https://www.sciencebase.gov/catalog/item/5deffc05e4b02caea0f4f3fc

Availability
Open Access

Format(s)
.tiff

Technical documentation
CILSS (2016)

Other references of interest
Cotillon (2017), Cotillon and Mathis (2017)


¹ (a): artificial; (ag): agriculture; (v): vegetation; (m): mixed classes;
(na): no data.


General Land Use Cover Datasets for Africa 
Project 


West Africa Land Use Dynamics was a project led by the 
AGRHYMET Regional Centre in collaboration with the 
Sahel Institute (INSAH), the USGS Earth Resources 
Observation and Science (EROS) and the US Agency for 
International Development (USAID). 17 different countries 
took part: Benin, Burkina Faso, Cape Verde, Chad, Ivory 
Coast, Gambia, Ghana, Guinea, Guinea-Bissau, Liberia, 
Mali, Mauritania, Niger, Nigeria, Senegal, Sierra Leone and 
Togo. 

As a result of the project, a LUC map series was created 
to monitor natural and environmental trends in the West 
Africa region. The dataset is part of a wider effort to create 
an atlas about landscape and environmental changes in West 
Africa. 


Production method 


West Africa Land Use Land Cover was obtained through 
photointerpretation of Landsat imagery with the Rapid Land 
Cover Mapper (RLCM) tool at a spatial resolution of 2 km. 
Gambia was photointerpreted at a spatial resolution of 2 km 
and Cape Verde at 500 m. Photointerpretation guidelines 
were developed specifically for the task. 


Product description 


Users can download a separate edition of the West Africa 
Land Use Land Cover dataset for each year of reference. In 
each case, the download includes the raster file with the LUC 
map as well as a file to symbolize the raster in GIS. An Excel 
file with the legend can also be downloaded from the website 
together with the detailed metadata files for each LUC map. 
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Downloads 
West Africa Land Use Land Cover (2013) 


— Raster file with LUC map 
— File to symbolize the raster in GIS (.clr) 


Legend and codification 


Code    Label
0       No data
1       Forest
2       Savanna
3       Wetland - floodplain
4       Steppe
5       Oasis
6       Plantation
7       Mangrove
8       Agriculture
9       Water bodies
10      Sandy area
11      Rocky land
12      Bare soil
13      Settlements
14      Irrigated agriculture
15      Gallery forest and riparian forest
16      Shrub and tree savanna
21      Degraded forest
22      Bowé
23      Thicket
24      Agriculture in bottomlands and flood recessional
25      Woodland
27      Cropland and fallow with oil palms
28      Swamp forest
29      Sahelian short grass savanna
31      Herbaceous savanna
32      Shrubland
78      Open mine
98      Cloud shadow
99      Cloud
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2 SERVIR-ESA—SERVIR Eastern and Southern Africa 


Product
LULC general 
Dates 


1990, 2000, 2010 (Malawi) 

1990, 2000, 2010, 2015 (Rwanda) 

2000, 2010 (Botswana, Namibia, Tanzania, Zambia) 
2000, 2014 (Lesotho, Uganda) 

2003, 2008 (Ethiopia) 


Formats 

Raster 

Pixel size 

30 m 

Thematic resolution 

7 classes: 

1 (a), 1 (ag), 2 (v), 0 (m), 1 (na)
Compatible legends 

IPCC 

Extent 

Eastern and Southern Africa 
Updating 

No updating confirmed 

Change detection 

Yes, but potential uncertainties have not been specified 


Overall accuracy 


Expected to be ≥63%


Website of reference
https://www.servirglobal.net/ServiceCatalogue/details/5bd052d451ebdcae79683375

Website language
English, Spanish, French


Download site 

http://geoportal.rcmrd.org/layers/?limit=100&offset=0 

Availability
Open Access

Format(s)
.tiff

Technical documentation 

Oduor et al. (2016) 

Other references of interest 

Al-Hamdan et al. (2017), Searby et al. (2019) 


Project 


SERVIR is an initiative led by the National Aeronautics and 
Space Administration (NASA) and the United States Agency 
for International Development (USAID) that aims to help 
developing countries to produce geospatial information 
suitable for climate risks and land use management. SER- 
VIR operates in West Africa, Eastern and Southern Africa, 
Hindu Kush Himalaya, the Lower Mekong, South America 
and Mesoamerica. 

In Eastern and Southern Africa in 2008, SERVIR 
started a project in partnership with the Kenya-based 
Regional Centre for Mapping of Resources for Develop- 
ment (RCMRD). Training, geospatial tools and geospatial 
datasets were developed as part of the project, including a 
dataset specifically aimed at LUC monitoring. Six countries 
were initially mapped (Botswana, Malawi, Namibia, 
Rwanda, Tanzania, and Zambia), with three more countries 
participating since 2014/15 (Ethiopia, Uganda, and 
Lesotho). 

As a result of this project, a LUC map covering all 
9 countries was developed. National LUC maps, with 
detailed national legends, were also provided as part of the 
project. 


Production method 


SERVIR-ESA was produced by aggregating LUC maps 
created at the national level according to the same 7-class 
classification scheme. For each country, a map with a legend 
adapted to the country’s specificities was also developed 
following the same general guidelines. 

The maps were obtained through supervised classification 
of Landsat imagery through a Maximum likelihood classi- 
fier. Auxiliary spatial and non-spatial data were also used in 
the classification. Settlements were manually photointer- 
preted from Google Earth imagery. 
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Errors and uncertainties in the classification resulting 
from this process were corrected in a post-classification step, 
which included expert review. 
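

To illustrate the principle behind a maximum likelihood classifier, the following Python sketch fits one Gaussian distribution per class to training samples and assigns each pixel to the class with the highest log-likelihood. It is a simplified illustration that assumes equal prior probabilities for all classes, and it is not the software used to produce SERVIR-ESA; the function names and the synthetic band values are hypothetical.

```python
import numpy as np


def train_gaussian_ml(samples, labels):
    """Estimate the mean vector and covariance matrix of each class.

    samples : array (n_samples, n_bands) of training pixel values
    labels  : array (n_samples,) with the class of each training pixel
    """
    samples = np.asarray(samples, dtype=float)
    labels = np.asarray(labels)
    model = {}
    for cls in np.unique(labels):
        x = samples[labels == cls]
        mean = x.mean(axis=0)
        # A small ridge keeps the covariance matrix invertible.
        cov = np.atleast_2d(np.cov(x, rowvar=False)) + 1e-6 * np.eye(x.shape[1])
        model[cls.item()] = (mean, np.linalg.inv(cov), np.linalg.slogdet(cov)[1])
    return model


def classify_ml(pixels, model):
    """Assign each pixel to the class with the highest Gaussian log-likelihood."""
    pixels = np.asarray(pixels, dtype=float)
    classes = list(model)
    scores = np.empty((len(classes), pixels.shape[0]))
    for i, cls in enumerate(classes):
        mean, inv_cov, log_det = model[cls]
        diff = pixels - mean
        mahalanobis = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
        scores[i] = -0.5 * (log_det + mahalanobis)  # constant terms omitted
    return np.asarray(classes)[scores.argmax(axis=0)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic two-band training samples for Scheme I classes 1 and 3.
    forest = rng.normal([0.1, 0.4], 0.02, size=(50, 2))
    cropland = rng.normal([0.5, 0.2], 0.02, size=(50, 2))
    model = train_gaussian_ml(np.vstack([forest, cropland]), [1] * 50 + [3] * 50)
    print(classify_ml([[0.1, 0.4], [0.5, 0.2]], model))
```

In practice the training samples would come from the Landsat bands and the auxiliary data mentioned above, and the result would still need the post-classification review described in the text.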


Product description 


The SERVIR-ESA LUC map is distributed at national level. 
For each country, users can download the harmonized map 
for all Eastern and Southern Africa (Scheme I) or the specific 
LUC map with a detailed legend for the selected country 
(Scheme II). 


Downloads 


Scheme I product/Scheme II product 
— Raster file with coloured LUC map 


Legend and codification 


In this description, we only include the general 7-class 
legend adopted for all the LUC maps. However, a specific 
legend is available for each national map, which can be 
consulted online. 


Scheme I legend 


Code    Label
0       No data
1       Forestland
2       Grassland
3       Cropland
4       Wetland
5       Settlement
6       Other land


Practical considerations 


The maps for each country were usually produced for
different dates, which makes comparisons between countries
difficult.
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3 SADC Land Cover Database 


Product 
LULC general 
Dates 


1990/91 (Malawi)
1997 (Tanzania, Zimbabwe) 
1999 (Mozambique, South Africa, Lesotho, and Swaziland) 


Formats 


Vector 


Scale 
1:250,000 


Thematic resolution 


13 classes: 
1 (a), 2 (ag), 5 (v), 0 (m), 1 (na) 
Compatible legends 


None 
Extent 


Southern African Development Community (Lesotho, Malawi, 
Mozambique, South Africa, Swaziland, Tanzania, Zimbabwe) 


Updating 
No 
Change detection 


No (only one date) 


Overall accuracy 


Not specified 


Website of reference
http://gsdi.geoportal.csir.co.za/projects

Website language
English

Download site
http://gsdi.geoportal.csir.co.za/projects

Availability
Open Access

Format(s)
.shp


Technical documentation 


Other references of interest 


Project 


The SADC Land Cover Database is the fruit of a project funded
by the South African Department of Arts, Culture, Science 
and Technology (DACST) through the Regional Science and 
Technology Programme. It was coordinated by the Council 
for Scientific and Industrial Research (CSIR) in South 
Africa, with the participation of organizations from the dif- 
ferent countries being mapped. 

The objective of the project was to deliver a coherent 
Land Use Cover map covering the Southern African 
Development Community (SADC) region. The project 
builds on earlier LUC mapping work carried out at national 
and regional scales for each of the mapped countries. 

The map covers those SADC countries that already had a 
LUC dataset available for their territory: Lesotho, Malawi, 
Mozambique, South Africa, Swaziland, Tanzania, Zim- 
babwe. The other countries in the region are not included in 
the map. 


Production method 


The SADC Land Cover Database was obtained by harmo- 
nizing and fusing the different national and regional LUC 
datasets. All the datasets were originally obtained by clas- 
sification or photointerpretation of Landsat imagery, 
although the reference years vary from country to country. 

The maps were combined by resampling to a spatial 
resolution of 1 km, before being reclassified according to the 
same classification system. This reduced the detail of the 
original maps, a deliberate action to avoid copyright and 
commercialisation issues. 
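

The two harmonization steps described above (resampling to a coarser resolution, then reclassifying to a common legend) can be sketched in Python as follows. This is only an illustration of the general idea on small arrays, not the procedure used by the project: the majority rule, the national codes and the lookup table are all hypothetical assumptions.

```python
import numpy as np


def majority_resample(raster, factor):
    """Aggregate a categorical raster by taking the most frequent class
    in each block of factor x factor cells (ties go to the lowest code)."""
    raster = np.asarray(raster)
    rows, cols = raster.shape[0] // factor, raster.shape[1] // factor
    blocks = (raster[:rows * factor, :cols * factor]
              .reshape(rows, factor, cols, factor)
              .swapaxes(1, 2)
              .reshape(rows, cols, -1))
    out = np.empty((rows, cols), dtype=raster.dtype)
    for r in range(rows):
        for c in range(cols):
            values, counts = np.unique(blocks[r, c], return_counts=True)
            out[r, c] = values[counts.argmax()]
    return out


def reclassify(raster, lookup, nodata=0):
    """Translate national codes to common-legend codes with a lookup table."""
    raster = np.asarray(raster)
    out = np.full(raster.shape, nodata, dtype=int)
    for national_code, common_code in lookup.items():
        out[raster == national_code] = common_code
    return out


if __name__ == "__main__":
    # Hypothetical national map and lookup table to a common legend.
    national = np.array([[11, 11, 20, 20],
                         [11, 12, 20, 30],
                         [30, 30, 12, 12],
                         [30, 11, 12, 12]])
    lookup = {11: 1, 12: 1, 20: 2, 30: 3}
    coarse = majority_resample(national, factor=2)
    print(reclassify(coarse, lookup))
```

The example shows why detail is lost in the process: minority classes inside each block disappear, and several national classes collapse into a single common category.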
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Product description 


The dataset is downloaded as a single compressed file (.zip), 
which includes the vector LUC map, a metadata file and a 
complete map (i.e. with colours, graphics, scale and legend) 
in jpg format that is ready to print out. 


Downloads 
SADC 
— Vector file with LUC map (.shp) 


— Edited map in a non-modifiable format (.jpg) 
— Metadata file (.html) 


Legend and codification 


Label
Forest
Woodland
Bushland
Low shrubland
Grassland
Bare ground
Plantation
Cultivation
Built-on
Open water
Wetland
Ice-cap/Snow
Not classified


Practical considerations 


A detailed description of the map categories is available in 
the dataset's metadata. The map's production method entails 
certain limitations and uncertainties, in that each country has 
been mapped by a different team, using different sources of 
imagery for different reference years. Inconsistencies may 
therefore arise when comparing information between 
countries. 
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4 AFRICOVER 
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Product 
LULC general 
Dates 


1994/01 
(the reference year varies according to the country) 


Formats 


Vector 


Scale 
1:200,000 


Thematic resolution 


8 classes: 
2 (a), 2 (ag), 2 (v), 0 (m), 0 (na) 


Compatible legends 
FAO-LCCS 

Extent 

Africa 

Updating 

No 

Change detection 


No (only one date) 


Overall accuracy 


Expected to be > 80% 


Website of reference
http://www.fao.org/geospatial/projects/detail/en/c/1035404/

Website language
English

Download site
http://www.fao.org/geospatial/projects/detail/en/c/1035404/

Availability
Open Access

Format(s)
.shp

Technical documentation
Di Gregorio and Jansen (1996), FAO (1997)

Other references of interest
Di Gregorio (2009), Kalensky (1998), Latham et al. (2002)


Project 


AFRICOVER was a project led and coordinated by the Food 
and Agriculture Organization (FAO) of the United Nations, 
which aimed to create georeferenced data for the African 
continent. The FAO helped the different countries and 
regions to develop their reference maps, establishing the 
standards for the final product. Twelve countries participated 
in the project (Burundi, Democratic Republic of Congo, 
Egypt, Eritrea, Kenya, Rwanda, Somalia, Sudan, Tanzania, 
Uganda, Libya and Malawi) which therefore required 
extensive coordination of many national and regional teams 
across Africa. 

A keystone of the project was the production of LUC
maps for Africa. In addition to LUC maps, other
georeferenced data were created for a range of themes,
including hydrology, geomorphology and demography.


Production method 


The production of AFRICOVER was decentralised at a 
national and regional level. Although the FAO defined the 
guidelines and standards for the product, national and 
regional teams from each country were responsible for its 
execution. This meant that although a set of common char- 
acteristics regarding the production of AFRICOVER had 
been established for all the countries involved, certain 
specificities could also arise. 

AFRICOVER LUC maps were mainly obtained through
photointerpretation of satellite imagery, of which Landsat
was the main source. The photointerpretation scale was
1:200,000. When drawing LUC polygons, the FAO LCCS
classification scheme was followed. The FAO provided
national and regional teams with specific software and
training to carry out LUC mapping according to this
approach.
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Product description 


AFRICOVER LUC maps are distributed at a national level. 
A compressed file can be downloaded for each country. This 
includes the vector LUC map and a legend description to 
help users interpret it. 


Downloads 


Land cover folder 


— Vector file with LUC map (.shp) 
— PDFs describing the classification legend 
— Excel file with the classification legend 


Legend and codification 


Label
Cultivated Terrestrial Areas and Managed Lands
Natural and Seminatural Terrestrial Vegetation
Cultivated Aquatic or Regularly Flooded Areas
Natural and Seminatural Aquatic Vegetation
Artificial Surfaces and Associated Areas
Bare Areas
Artificial Waterbodies
Inland Waterbodies


Practical considerations 


AFRICOVER LUC maps have been created following the
FAO LCCS classification scheme. This means that each
LUC polygon is described through a specific code that
identifies the general cover of the polygon and characterizes
it through a series of labels. Users may find this system
difficult to understand, as it does not follow a common
hierarchical classification legend in which each polygon is
defined by a single category.
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5 CCI LAND COVER - S2 PROTOTYPE 


Product 

LULC general 
Dates 

2016 

Formats 

Raster 

Pixel size 

20m 

Thematic resolution 


10 classes: 
1 (a), 1 (ag), 5 (v), 0 (m), 0 (na) 


Compatible legends 

FAO-LCCS 

Extent 

Africa 

Updating 

Expected, but no specific date has been set 
Change detection 

No (only one date) 


Overall accuracy 


Expected to be >65% 


Website of reference 

http://2016africalandcover20m.esrin.esa.int/ 

Website Language 

English 

Download site 

http://2016africalandcover20m.esrin.esa.int/download.php 

Availability 

Open Access after registration 

Format(s) 

.tiff 

Technical documentation 

Lesiv et al. (2017) 


Other references of interest 


General Land Use Cover Datasets for Africa 
Project 


The CCI LAND COVER - S2 PROTOTYPE map is part of 
the Land Cover — Climate Change Initiative led by the 
European Space Agency (ESA). The purpose of this initia- 
tive is to deliver Land Cover products that meet the 
requirements of the climate change research community. 
The map was created as a prototype to collect feedback 
from users for future improvements. At the time it was 
released, it was the highest spatial resolution LUC map 
covering the whole African continent and one of the few 
products providing consistent LUC coverage for all of Africa. 


Production method 


The map was obtained after classification of Sentinel-2A 
imagery for the reference year 2016. Two different classifi- 
cations were carried out, through random forest and machine 
learning classifiers. The final map is a combination of the two 
classifications. Auxiliary datasets were used to map the "open 
water" (extracted from the Global Surface Water product) 
and "urban areas" (extracted from the Global Human Settlement 
Layer and the Global Urban Footprint) categories. 


Product description 


CCI LAND COVER - S2 PROTOTYPE is distributed as a 
single compressed file, including the raster with LUC 
information and a style layer to symbolize the map in GIS 
software. The legend is described in two auxiliary files, in 
Excel and PDF. 
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Downloads 


S2 PROTOTYPE LC at 20 m AFRICA 2016 


— Raster file with LUC map 
("ESACCI-LC-L4-LC10-Map-20m-P1Y-2016-v1.0") 

— Layer style file for GIS software (.qml) 

— Excel sheet with the map legend 

— PDF describing the map legend 


Legend and codification 


Code  Label 
1     Tree cover areas 
2     Shrubs cover areas 
3     Grassland 
4     Cropland (rainfed or irrigated) 
5     Vegetation aquatic or regularly flooded 
6     Lichens and mosses 
7     Bare areas 
8     Built up areas 
9     Snow and/or Ice 
10    Open water 


Practical considerations 


The map is distributed as a single, very large file (6 GB). 
Users with limited computing capacity or internet bandwidth may find 
it difficult to download and work with this product. 
Nonetheless, a preview tool is available online for any user 
wishing to consult the map. 




6 Congo Basin Vegetation Types 


Website of reference 
http://maps.elie.ucl.ac.be/geoportail/ 
Download site 
http://maps.elie.ucl.ac.be/geoportail/ 
Availability 

Open access 

Technical documentation 
Verhegghen et al. (2012) 


Other references of interest 
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Product 

LULC general 
Dates 

2000 / 07 

Formats 

Raster 

Pixel size 

300m 

Thematic resolution 


20 classes: 
1 (a), 2 (ag), 14 (v), 1 (m), 0 (na) 


Compatible legends 
FAO LCCS 

Extent 

Congo Basin region 
Updating 

Not expected 

Change detection 

No (only one date) 
Overall accuracy 
Expected to be > 71% 


Website Language English 


Format(s) 
tiff 




Project 


The Congo Basin Vegetation Types map was produced by a 
team of experts from the Université Catholique de Louvain, 
the Joint Research Centre (JRC) of the European Commis- 
sion and the Observatory for the Forests of Central Africa 
(OFAC). 

The map was produced to aid forest and vegetation 
monitoring in Central Africa. It provided a spatially 
coherent dataset for the whole Congo Basin region, with 
improved spatial discrimination with respect to previous 
datasets of a similar nature. 


Production method 


The Congo Basin Vegetation Types map was obtained by 
unsupervised classification of imagery composites created from 
the images provided by the MERIS and VEGETATION 
sensors. 

To account for the regional disparities of the mapped area 
and its different cloud coverage, the Congo Basin was split 
into four different zones: North, South, Western Centre and 
Eastern Centre. Seasonal imagery composites were created 
for each specific season in the northern and southern regions. 
In addition, an annual composite was generated for the 
whole mapped area. 

A separate classification exercise was performed for each 
mapped zone using a k-means clustering algorithm. The 
resulting clusters were labelled on the basis of the 
information provided by reference maps when the LUC information 
in these sources covered at least 50% of the identified 
cluster. The rest of the clusters were manually labelled on 
the basis of visual interpretation and expert knowledge. 
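
The labelling rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the workflow used by the production team: the function name `label_clusters`, the use of 0 as the no-data value of the reference map and the choice of the most frequent reference class as the cluster label are all hypothetical. Only the 50% coverage threshold comes from the description of the method.

```python
from collections import Counter


def label_clusters(clusters, reference, min_coverage=0.5, nodata=0):
    """Label each cluster from a reference map (illustrative sketch).

    clusters and reference are equally sized 2D lists. A cluster takes the
    most frequent reference class only if valid reference pixels cover at
    least min_coverage of it; otherwise it is set to None, meaning that it
    is left for manual labelling.
    """
    # Collect the reference values found under each cluster.
    pixels = {}
    for cluster_row, ref_row in zip(clusters, reference):
        for cluster_id, ref_class in zip(cluster_row, ref_row):
            pixels.setdefault(cluster_id, []).append(ref_class)

    labels = {}
    for cluster_id, ref_values in pixels.items():
        valid = [value for value in ref_values if value != nodata]
        coverage = len(valid) / len(ref_values)
        if valid and coverage >= min_coverage:
            # Assumption: the majority reference class becomes the label.
            labels[cluster_id] = Counter(valid).most_common(1)[0][0]
        else:
            labels[cluster_id] = None
    return labels


# Toy example: cluster ids as a k-means run might return them, and a
# partial reference map using codes from the legend (0 = no information).
clusters = [[0, 0, 1, 1],
            [0, 0, 1, 1],
            [2, 2, 2, 2]]
reference = [[1, 1, 0, 0],
             [1, 11, 0, 9],
             [10, 10, 0, 0]]

result = label_clusters(clusters, reference)
print(result)
```

In this toy case, cluster 1 has reference information for only one of its four pixels, so it would be passed on for visual interpretation, while clusters 0 and 2 meet the 50% threshold and are labelled automatically.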


Product description 


A compressed file (.zip) containing the raster layer with the 
LUC data can be downloaded, together with other auxiliary 
information to interpret and symbolize the map content. 


Downloads 


Congo Basin Vegetation Types map 


— Raster file with LUC map (.tif) 

— Layer style files for ArcGIS (.lyr) 

— Excel file with the map legend (.xls) 

— Text file with the metadata for the product (.txt) 


Legend and codification 


Code  Label 
1     Dense moist forest 
2     Submontane forest 
3     Mountain forest 
4     Edaphic forest 
5     Mangrove 
6     Forest/savanna mosaic 
7     Rural complex (forest area) 
8     Closed to open deciduous woodland 
9     Savanna woodland/tree savanna 
10    Shrubland 
11    Grassland 
12    Aquatic grassland 
13    Swamp grassland 
14    Sparse vegetation 
15    Mosaic cultivated areas/vegetation 
16    Agriculture 
17    Irrigated agriculture 
18    Bare areas 
19    Artificial surfaces and associated areas 
20    Water bodies 


Practical considerations 


Users can consult the LUC map online on the Université 
Catholique de Louvain website 
(http://maps.elie.ucl.ac.be/geoportail/). 
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Abstract | 


In this chapter we review some examples of general Land 
Use Cover (LUC) mapping at a supra-national level in 
America and Asia. These datasets provide a general 
overview of the land uses and covers in specific American 
or Asian regions, without focusing on any particular land 
use or cover. For Asia, we have only identified one 
dataset, mapping the Himalayan region, whereas for 
America five different datasets were identified. Only 
three of these are reviewed here, as the other two 
(SERENA, South America 30 m) are not available for 
download. The most ambitious project of all those 
reviewed is NALCMS, which coordinates the production 
of a LUC map for the whole of North America (Canada, 
Mexico, USA) at detailed scales (30-250 m) and using 
the same classification legend. It is the only dataset of all 
those reviewed that provides a time series of LUC maps 
(2005, 2010 and 2015). The Himalaya Regional Land 
Cover database is a vector-based map that provides 
information on LUC changes over the period 1970/80-2007 
at a scale of 1:350,000. The other two American 
datasets, LBA-ECO LC-08 (1 km, 1987/91) and 
MERISAM2009 (300 m, 2008/10), are raster-based and only 
available for one date, therefore making change detection 
impossible. 
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1 LBA-ECO LC-08—Land Cover Map of South America 


Product 

LULC general / LULC thematic (vegetation) 
Dates 

1987 / 91 

Formats 

Raster 

Pixel size 

1 km 

Thematic resolution 


42 classes: 
1 (a), 1 (ag), 27 (v), 7 (m), 3 (na) 


Compatible legends 
None 

Extent 

South America 
Updating 

Not expected 
Change detection 
No (only one date) 


Overall accuracy 


Depending on the class. Expected to be >90% for 24 classes covering 85% 
of the map. Classes with an accuracy of <75% only cover 6.5% of the map 


Website of reference 

https://daac.ornl.gov/LBA/guides/LC08_EOS_Maps.html#references 

Website Language 

English 

Download site 

https://daac.ornl.gov/cgi-bin/dsviewer.pl?ds_id=1155 

Availability 

Open Access after registration 

Format(s) 

.tiff, .nc, .asc, .nitf, .img 
Technical documentation 

Stone et al. (1994) 


Other references of interest 


(a): artificial; (ag): agriculture; (v): vegetation; (m): mixed classes; 
(na): no data. 


General Land Use Cover Datasets for America and Asia 


Project 


The Large-Scale Biosphere-Atmosphere Experiment in the 
Amazon (LBA) was an international project launched by the 
Brazilian scientific community in 1993. The main objectives 
were to study Amazonia and its role in the earth's ecosystem 
as well as to understand LUC changes in the area and their 
environmental consequences. 

As part of the project, a LUC map covering the whole of South 
America was produced from imagery and data for the period 
1987/91. Vegetation and soil maps for Brazil were also 
digitized on the basis of previous resources. These maps are also 
available to any interested user as part of the same dataset. 


Production method 


The LBA LUC map was produced by unsupervised 
classification of AVHRR imagery, followed by postprocessing and 
labelling of the classification results. Different sources of auxiliary 
data were used in the production of the dataset to overcome 
the limitations of the imagery, including a Global Vegetation 
Index (GVI) layer, UNESCO's Vegetation Map of South 
America, Hueck's Vegetationskarte von Südamerika 
and a potential vegetation map of South America based on 
the Holdridge bioclimatic scheme. 


Product description 


Users can download the LUC map as a single raster file 
including the LUC information or as part of a data package 
including all the products generated within the LBA project, 
among them different vegetation and soil maps 
for Brazil. In all cases, the download only includes the raster 
files and no auxiliary information is provided. 


Downloads 


RAR folder with all products 


— Raster file with LUC map 

— 3 raster files with Brazil vegetation maps at different levels of 
thematic resolution 

— 3 raster files with Brazil soil maps at different levels of thematic 
resolution 


SA_lc_Map_41class.tif 
— Raster file with LUC map 


Legend and codification 


Code  Label 
0     Off Map (Fill Value) 
1     Tropical moist and semi-deciduous forest 
2     Cleared tropical moist forest 
3     Unclassified 
4     Water 
5     Savanna/Grasslands 
6     Wet vegetation/Mixed 
7     Unclassified 
8     Mangroves 
9     Seasonally deciduous woodlands 
10    Forest (Bamboo dominated?) 
11    Secondary tropical moist forest with agriculture 
12    Pantanal grassland (seasonally flooded) 
13    Tropical seasonal or deciduous forest 
14    Agriculture 
15    Gallery forests 
16    Tropical open forests (mixed) 
17    Cerrado (woodlands) degraded 
18    Grasslands or Savanna with agriculture 
19    Xerophytic woodlands with agriculture 
20    Degraded xerophytic woodlands 
21    Secondary seasonal forest with agriculture 
22    Urban and degraded lands 
23    Degraded tropical seasonal forest 
24    Mixed pine forest with secondary forest and agriculture 
25    Xerophytic scrubland 
26    Xerophytic littoral vegetation 
27    Montane grassland 
28    Montane woodlands 
29    Montane forests 
30    Degraded montane grasslands 
31    Degraded montane woodlands 
32    Degraded montane forests 
33    Cool deciduous shrublands 
34    Bare soil/Rock 
35    Cool deciduous woodlands 
36    Cool deciduous forests 
37    Snow/Rock 
38    Salt marsh community 
39    Desert 
40    Degraded temperate deciduous forest 
41    Temperate deciduous forests 

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2 NALCMS—North American Land Change Monitoring System 


Product 

LULC general 
Dates 

2005, 2010, 2015 
Formats 

Raster 

Pixel size 

30 m (2010, 2015) 
250 m (2005, 2010) 
Thematic resolution 


19 classes: 
1 (a), 1 (ag), 13 (v), 1 (m), 0 (na) 


Compatible legends 
FAO-LCCS 

Extent 

North America 
Updating 

Unknown 

Change detection 


Through change layers 


Overall accuracy 
Expected to be ≥79.9% 
Website of reference Website Language English, Spanish, French 
http://www.cec.org/north-american-land-change-monitoring-system/ 
Download site 
http://www.cec.org/north-american-land-change-monitoring-system/ 
Availability Format(s) 
Open Access .tiff, .img, .mxd 
Technical documentation 
Colditz et al. (2012, 2014a, b, c), Gebhardt et al. (2014), Homer et al. (2015), Jin et al. (2013, 2019), Latifovic et al. (2012, 2017) 
Other references of interest 
Yang et al. (2018) 
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Project 


The NALCMS project started in 2006 as a collaboration 
between the following Canadian, American and 
Mexican institutions: Natural Resources Canada/Canada 
Centre for Remote Sensing (NRCan/CCRS), the United 
States Geological Survey (USGS) and, from Mexico, the National 
Institute of Statistics and Geography (INEGI), the National 
Commission for the Knowledge and Use of Biodiversity 
(CONABIO) and the National Forestry Commission of 
Mexico (CONAFOR). The project is also supported by the 
Commission for Environmental Cooperation (CEC), a body 
comprising all three North American countries. 

The objective of the project was to create a homogeneous, 
coherent LUC dataset for North America that could be used 
for environmental monitoring at a continental scale, and 
which also addressed the needs and requirements of scien- 
tific and policy-making communities. Each country pro- 
duced its own LUC map according to its needs and 
requirements. The purpose of the project was to coordinate 
the homogenization and harmonization of these national 
maps to create a single map of the whole of North America. 

Since it was launched in 2006, three LUC maps have 
been produced. Important improvements have been made 
over time. The most significant change was the improved 
spatial resolution of 30 m applied in the latest maps, com- 
pared to 250 m in the first edition. 


Production method 


There is no single production methodology for NALCMS. 
Each country is responsible for producing its own LUC map, 
according to its particular needs and interests. 

The first edition of the product for 2005 was created via a 
classification of MODIS imagery at 250 m following a 
similar workflow for the three countries. In 2010, the initial 
map for 2005 at 250 m was revised, mapping only the LUC 
changes that happened over the period 2005-2010. LUC 
changes for Hawaii were not mapped in this update. Mapped 
changes were individually distributed through a specific 
change layer at 250 m for the period 2005-2010. 

For 2015, Canada and the USA obtained their respective 
LUC maps after classification of Landsat imagery, while 
Mexico obtained its map via the classification of RapidEye 
(5 m) imagery resampled at 30 m. Whereas for Canada and 
Mexico the imagery mostly dates from 2015, most of the 
imagery used in the US map was from the year 2016. For 
2010, the three countries obtained the map at 30 m from the 
classification of Landsat imagery. However, whereas most of 
the imagery for Canada and Mexico was captured in 2010, 
the images used to map the USA were taken in 2011. 

A change layer at 30 m for the period 2010-2015 was 
obtained by comparing the base LUC maps at the two different 
dates for Canada and the USA. In Mexico, because different 
imagery sources had been used for the different reference 
years, the changes were individually extracted from Landsat 
imagery based on an independent change detection algorithm. 


Product description 


NALCMS can be separately downloaded for each of the 
reference years. A change layer for each mapped period is 
also available: 2005-2010 and 2010-2015. For those years 
for which more than one spatial resolution is available, users 
can download a separate product at each resolution. 

The datasets at 250 m can be downloaded in different 
formats: GeoTIFF, ERDAS Imagine (.img), Map Exchange 
Document (.mxd) and as a georeferenced PDF file (GeoPDF). 
Datasets at 30 m are downloaded in a compressed file (.zip) in 
GeoTIFF. They can be downloaded for the whole of North 
America or individually for each of the mapped countries. 

Different auxiliary information is provided with each 
downloaded product. Nonetheless, the metadata for all the 
available products can be downloaded separately from the 
dataset's website. 


Downloads 


Land Cover, 2005-2010 (MODIS, 250 m), TIFF 


— Raster files with North America and Hawaii LUC maps (.tiff) 
— Metadata file (.doc) 

— Definitions of the different classes (.doc) [Only 2010 map] 

— Press release presenting the product (.doc) 

— Terms of use of the product (.doc) 


Land Cover Change, 2005-2010 (MODIS, 250 m), TIFF 


— Raster files with LUC changes (.tiff) 

— Layer style file for ArcGIS (.lyr) 

— Cross tabulation matrices of change (in ha, percent and pixels) for 
two different classification schemes (.xlsx) 

— Metadata file (.doc) 

— Press release presenting the product (.doc) 

— Terms of use of the product (.doc) 


Land Cover, 2010-2015 (Landsat, 30 m), North America 


— Raster file with LUC map 
— Layer style files for ArcGIS (.lyr) in English, French and Spanish 
— Metadata file (.doc) 


Land Cover Change, 2010-2015 (Landsat, 30 m), North America 


— Raster files with gains and losses for the Forest, Shrubland, 
Grassland, Wetland, Cropland, Barren Land, Urban and Built-up, 
Water and Snow and Ice categories (.tiff) 

— Raster file with LUC changes 

— Metadata file (.doc) 

— Text document with a description of the dataset (.txt) 
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Legend and codification 


The change layers include a qualitative description of the 
classes at the two different points in time. In addition, the 
pixel values are formed by combining the class code for the 
land use at the first point in time with the class code for the new 
land use at the second. For example, the code 1011 refers to a pixel that 
was Temperate or sub-polar grassland (10) on the first date 
assessed and had changed to Sub-polar or polar 
shrubland-lichen-moss (11) on the second. 


Code  Label 
1     Temperate or sub-polar needleleaf forest 
2     Sub-polar taiga needleleaf forest 
3     Tropical or sub-tropical broadleaf evergreen forest 
4     Tropical or sub-tropical broadleaf deciduous forest 
5     Temperate or sub-polar broadleaf deciduous forest 
6     Mixed forest 
7     Tropical or sub-tropical shrubland 
8     Temperate or sub-polar shrubland 
9     Tropical or sub-tropical grassland 
10    Temperate or sub-polar grassland 
11    Sub-polar or polar shrubland-lichen-moss 
12    Sub-polar or polar grassland-lichen-moss 
13    Sub-polar or polar barren-lichen-moss 
14    Wetland 
15    Cropland 
16    Barren lands 
17    Urban 
18    Water 
19    Snow and Ice 
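
The coding of the change layers can be illustrated with a short Python sketch that splits a pixel value into its two class codes. It rests on an assumption that should be checked against the product metadata: that the value equals the first class code multiplied by 100 plus the second class code, so that the second class always occupies the last two digits. This is consistent with the 1011 example given for the change layers, but it is not confirmed here. The names `NALCMS_CLASSES`, `decode_change` and `describe_change` are hypothetical.

```python
# Legend of the NALCMS LUC maps, as listed in the table above.
NALCMS_CLASSES = {
    1: "Temperate or sub-polar needleleaf forest",
    2: "Sub-polar taiga needleleaf forest",
    3: "Tropical or sub-tropical broadleaf evergreen forest",
    4: "Tropical or sub-tropical broadleaf deciduous forest",
    5: "Temperate or sub-polar broadleaf deciduous forest",
    6: "Mixed forest",
    7: "Tropical or sub-tropical shrubland",
    8: "Temperate or sub-polar shrubland",
    9: "Tropical or sub-tropical grassland",
    10: "Temperate or sub-polar grassland",
    11: "Sub-polar or polar shrubland-lichen-moss",
    12: "Sub-polar or polar grassland-lichen-moss",
    13: "Sub-polar or polar barren-lichen-moss",
    14: "Wetland",
    15: "Cropland",
    16: "Barren lands",
    17: "Urban",
    18: "Water",
    19: "Snow and Ice",
}


def decode_change(value):
    """Return (class at date 1, class at date 2) for a change pixel value.

    Assumption: value = first_class * 100 + second_class.
    """
    first, second = divmod(value, 100)
    if first not in NALCMS_CLASSES or second not in NALCMS_CLASSES:
        raise ValueError(f"{value} is not a valid change code")
    return first, second


def describe_change(value):
    """Return a readable description of the transition."""
    first, second = decode_change(value)
    return f"{NALCMS_CLASSES[first]} -> {NALCMS_CLASSES[second]}"


print(describe_change(1011))
```

Under this assumption, a value such as 1415 would be read as a change from Wetland (14) to Cropland (15).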


Practical considerations 


Maps at 30 m and 250 m were obtained following a different 
workflow and are not comparable. The maps for Mexico for 
2010 and 2015 were obtained from different imagery sources, 
which means that changes cannot be calculated by subtract- 
ing one map from the other and should only be studied using 
the change layer distributed by the production team. 

No information is offered about the uncertainty of the 
change layers. They may be subject to important sources of 
uncertainty and may include a lot of technical or spurious 
changes that did not actually happen on the ground. 

NALCMS is one of the products in the North American 
Environmental Atlas. Users can consult the different 
NALCMS layers online, together with a lot of other relevant 
geospatial information for North America, as part of the 
Atlas website at http://www.cec.org/files/atlas/. Users can 
also download any of the displayed layers, including the 
LUC maps, from the same website. 
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3 MERISAM2009—MERIS MAP 2009/2010 South America 


Product 

LULC thematic 
Dates 

2009/10 

Formats 

Raster, Vector 

Pixel size 

300m 

Thematic resolution 


11 classes: 


0 (a), 3 (ag), 5 (v), 5 (m), 1 (na) 
Compatible legends 

None 

Extent 

South America 

Updating 

Not expected 

Change detection 


No (only one date) 


Overall accuracy 
Not specified 
Website of reference Website Language English 
Not available 
Download site 
Not available 
Availability Format(s) 
On request .img, .shp 
Technical documentation 
Hojas-Gascon et al. (2012) 


Other references of interest 
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Project 


MERISAM is a map developed by the Joint Research Centre 
(JRC) of the European Commission as part of the regional 
LUC mapping efforts for South America. With the produc- 
tion of MERISAM, the JRC team aimed to overcome some 
of the limitations encountered during the production of 
GlobCover for South America. These referred mainly to 
spatial and thematic inaccuracies due to the limited number 
of MERIS images acquired and the method followed to 
produce the imagery mosaic required to carry out the 
classification. 

The MERISAM dataset was used to assess LUC change 
in the first decade of the 21st century by comparing it with 
the GLC2000 dataset. 


Production method 


MERISAM was obtained after unsupervised classification of 
MERIS imagery for the period 2008-2010 using the ISO- 
DATA classification algorithm, which identified 100 differ- 
ent spectral classes. These were manually assigned to 6 LUC 
categories based on the information provided by auxiliary 
datasets, such as national vegetation maps and Google Earth 
imagery. FAPAR data, which provide information on the 
photosynthetic activity of the vegetation, were also used as 
auxiliary information to disaggregate the initial set of LUC 
categories. 


Product description 


Interested users can access this dataset by contacting the JRC 
team that produced it. The dataset includes the LUC map in 
two formats: raster (.img) and vector (.shp). The vector file 
was obtained by vectorizing the original raster file. 
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Downloads 


MERISAM2009 


— Raster file with LUC map (.img) 
— Vector file with LUC map (.shp) 
— Two versions of the scientific paper presenting the dataset (.pdf) 


Legend and codification 


The following codes are used in the raster version of the 
map. 


Code  Label 
1     Evergreen forest 
2     Dry forest and shrubs 
3     Dry open forest and shrubs 
4     Grasslands 
5     Agriculture and pasture 
6     Sparse and barren 
10    Inland water 
41    Grasslands and shrubs 
51    Agriculture mosaic 
52    Agriculture intensive 
0     Background 


Practical considerations 


This dataset is not directly available for download. Users 
wishing to access it must contact the JRC team that produced 
it (Hugh.EVA@ec.europa.eu, Rene.BEUCHLE@ec.europa.eu). 

Although the dataset has been used to assess LUC 
changes by comparing it with GLC2000, this exercise has 
many limitations and uncertainties and is therefore not 
recommended. 
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4 The Himalaya Regional Land Cover Database 


Product 
LULC general 
Dates 


2000 (base LUC map) 
1970-2007 (LUC changes) 


Formats 

Vector 

Scale 

1:350,000 

Thematic resolution 


35 classes: 
1 (a), 7 (ag), 15 (v), 7 (m), 0 (na) 


Compatible legends 
LCCS 

Extent 

Himalaya region 
Updating 

No 


Change detection 


Yes, through the change layer 
Overall accuracy 

Not specified 

Website of reference 
http://www.fao.org/geonetwork/srv/en/main.home?uuid=46d3c2ef-72c3-4196-8e32-40723cd1847b 

Website Language 
English 

Download site 
http://www.fao.org/geonetwork/srv/en/main.home?uuid=46d3c2ef-72c3-4196-8e32-40723cd1847b 

Availability 
Open Access 

Format(s) 
.shp 

Technical documentation 


Other references of interest 
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Project 


The Himalaya Regional Land Cover database was developed 
within the context of the Global Land Cover Network— 
Regional Harmonization Programme, promoted by the Food 
and Agriculture Organization of the United Nations 
(FAO) and UN Environment in collaboration with the 
Geographic Information for Sustainable Development 
(GISD) global partnership. The programme aimed to pro- 
duce reliable, harmonized global land cover information, 
providing guidance and methodologies for the production of 
LUC information at national, regional and global levels. 


Production method 


The database was obtained by automatic segmentation of 
Landsat imagery for the reference year 2000 plus visual 
interpretation. The initial classification was refined by 
interpreting high resolution imagery from Google Earth. 

A layer of LUC changes was obtained by assessing the 
base map (2000) against historical imagery for the periods 
1970-80, 1990 and 2007. No maps are available for these 
other reference years, only the respective layers of 
changes. 


Database 


Himalaya Regional Land Cover Database 
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Product description 


The database is distributed at regional level in vector format 
for each of the countries and regions that make up the 
Himalayan region: Afghanistan, Bhutan, China-Yunnan 
Sheng, China-Xizang Zizhiqu, India, Nepal, Pakistan, 
Aksai Chin, Arunachal Pradesh, China/India, Jammu 
Kashmir and Myanmar. An additional vector layer with 
LUC changes for the period 1970-2007 is also included. The 
downloaded products consist solely of the vector layers with 
LUC data. No other auxiliary information is provided with 
the downloaded file. 

A detailed legend for the product can be downloaded 
separately in Excel or mdb formats. A layer with the 
boundaries of the region and its administrative units is also 
available for download. 


Downloads 


Land Cover map (country/region) 


— Vector file with Land Cover map (.shp) 


Land change Himalaya region 


— Vector file with map of Land Cover changes (.shp) 
— Vector file with boundaries of the Himalaya region (.shp) 


2007CODE 2007USLB 2007PERC HECTARES AREA AGG ZONE CODE1 CODE2 BOOLEAN1 BOOLEAN2 LCCSMAIN1 LCCSMAIN2 AUTO ID 
1 20059-122... 2HS//6BR 100 97714.744 977147... BS Zone 44 2HS//6BR A2A14B4-B13 / 2//6 221550 
2 20377-120... 2SOd 100 51050.075 510500... H&S Zone 44 2SOd A4SA11B3XXXXX... N 2 221580 
3 20059-122... 2HS//6BR 100 48652.191 486521... BS Zone 44 2HS//6BR A2A14B4-B13 /... / 2//6 222894 


— 2007CODE: LUC Code 
— 2007USLB: LUC User Label 
— 2007PERC: Percentage of the LUC(s) making up the polygon 
— HECTARES: Area of the polygon, in hectares 
— AREA: Area of the polygon, in square meters 
— AGG 
— ZONE: UTM Zone 
— CODE1: Code LUC 1 
— CODE2: Code LUC 2 
— BOOLEAN1: LUC Label 1 
— BOOLEAN2: LUC Label 2 
— LCCSMAIN1: Main LUC 1 
— LCCSMAIN2: Main LUC 2 
— AUTO ID: Unique identifier for each polygon 
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Legend and codification 


Code              Label 
1H                Herbaceous Crops 
1HI               Irrigated Herbaceous Crops 
1T                Tree Crop 
1S                Tea Crop 
1HSs              Small Herbaceous Crops in sloping land 
1HLMv             Large to Medium Herbaceous Crops in valley floor 
1HSv              Small Herbaceous Crops in valley floor 
2HCO              Closed to Open Herbaceous 
2HS               Sparse Herbaceous 
2HS//6BR          Sparse Herbaceous OR Bare Rock 
2HCO//1H          Closed to Open Herbaceous OR Rainfed Herbaceous Crops 
2SCO              Closed to Open Shrubs 
2SS               Sparse Shrubs with Sparse Herbaceous 
2SSd              Sparse Dwarf Shrubs with Sparse Herbaceous 
2SOd              Open Dwarf Shrubs with Sparse Herbaceous 
2TCOne//2TCObe    Closed to Open Needleleaved Trees OR Closed to Open Broadleaved Trees 
2TCOne            Closed to Open Needleleaved Trees 
2TCObe            Closed to Open Broadleaved Trees 
2TSne//2TSbe      Sparse Needleleaved Trees OR Sparse Broadleaved Trees 
2TSne             Sparse Needleleaved Trees 
2TSbe             Sparse Broadleaved Trees 
4HCOp             Closed to Open Permanently Flooded Herbaceous 
4SCOs             Closed to Open Seasonally Flooded Shrubs 
5UI               Urban and Industrial Areas 
6BR               Bare Rock 
6S                Bare Soil 
6GR               Rock Debris 
8ICE              Glacier 
8ICEr             Rocky Glacier 
8SN               Perennial Snow 
8SNs              Seasonal Snow 
8WNP              Non-Perennial Lakes 
8WBS              Bare Soil in seasonally flooded area 
8WP               Lakes 
8WF               Rivers 
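
Several codes in the legend above, such as 2HS//6BR, combine two simpler codes. A minimal Python sketch of how such codes could be broken down is given below. It assumes that "//" separates the component classes of a mixed unit, which is inferred from the legend labels rather than stated in the product documentation. The names `HIMALAYA_LEGEND`, `split_code` and `describe_code` are hypothetical, and only a few legend entries are included for brevity.

```python
# Subset of the single-class codes from the legend above.
HIMALAYA_LEGEND = {
    "1H": "Herbaceous Crops",
    "2HCO": "Closed to Open Herbaceous",
    "2HS": "Sparse Herbaceous",
    "2TCOne": "Closed to Open Needleleaved Trees",
    "2TCObe": "Closed to Open Broadleaved Trees",
    "6BR": "Bare Rock",
}


def split_code(code):
    """Split a (possibly mixed) LUC code into its component codes.

    Assumption: "//" separates the classes that make up a mixed unit.
    """
    return [part.strip() for part in code.split("//")]


def describe_code(code):
    """Build a label for a code from the labels of its components."""
    return " OR ".join(HIMALAYA_LEGEND[part] for part in split_code(code))


print(split_code("2HS//6BR"))
print(describe_code("2HS//6BR"))
```

Labels composed in this way will not always match the published legend word for word. For instance, 2HCO//1H is labelled "Closed to Open Herbaceous OR Rainfed Herbaceous Crops" in the legend, whereas 1H on its own is simply "Herbaceous Crops", so the published label should take precedence where the two differ.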
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Abstract 


Vegetation covers were one of the first land covers to receive 
special attention when thematic Land Use Cover 
(LUC) maps first appeared. Interest in this subject has 
remained strong since then because of the valuable 
information that these datasets provide for monitoring forests, 
deforestation and climate change, among other issues. 
A wide variety of thematic LUC datasets characterizing 
vegetation covers are currently available. In this chapter, we 
review eleven of these datasets, most of which provide long 
series of LUC maps, thus permitting the study of LUC change. 
In thematic terms, most of the maps provide information on 
the vegetation or tree cover fraction per pixel, thus 
characterizing the vegetation covers on Earth in great detail. A specific 
dataset has been found that maps mangrove distribution 
across the globe at 30 m for one date (1997/00). It is not 
included in this review because of its high specificity, which 
means it is only of interest to certain communities of users. 
Of all the products reviewed here, the World’s Forests 2000 
is probably the most basic, providing information about 
three wooded cover categories for the year 1995/96 at a 
spatial resolution of 1 km. SYNMAP is a very specific 
thematic map designed to meet the needs of the carbon cycle 
and vegetation modelling community, which was produced 
at a spatial resolution of 1 km and with a legend of 48 
categories. Among the maps providing information on the 
fraction of vegetation cover per pixel, the Hybrid Forest 
Mask 2000 (1 km) and the PTC Global Version (500 m-1 km) 
offer relatively coarse resolutions and few points in 
time: just one date in the former (2000) and two in the latter 
(2003, 2008). The Forests of the World 2010 is also 
available for just one year (2010), albeit at a more detailed 
spatial resolution (250 m). Various datasets provide 
information on the cover fraction for long periods of time at 
medium and high spatial resolutions. FCover provides the 
longest time series (1999-present) at 1 km, although since 
2014 this dataset has also been available at 300 m. MODIS VCF also 
offers a long data series (2000-2019) at a spatial resolution 
of 250 m. MEaSUREs Vegetation Continuous Fields 
(VCF) is another thematic LUC dataset providing 
information on the tree cover fraction of the Earth's surface for a very 
long time period: 1982-2016. However, it is not reviewed 
here because of its coarse spatial resolution (around 5.6 km 
at the Equator). At very detailed spatial resolutions, 
GFCC30TC Landsat VCF (30 m) provides data on the 
cover fraction for four different points in time, between 2000 
and 2015. It also gives information on forest change for two 
periods (1990-2000/2000-2005) through the associated 
GFCC30FCC dataset. The Hansen forest map (30 m) also 
provides one of the longest time series, from 2000 to 2019. 
Global FNF is the dataset with the highest resolution (25 m) 
of all those reviewed. It is available for two periods of time: 
2007-2010 and 2015-2017. In thematic terms, however, 
this dataset is less detailed, in that it only differentiates 
between forest and non-forest covers. TanDEM-X 
Forest/Non-Forest also provides information on the forest 
extent at high spatial resolution (50 m). However, the map is 
only available for one point in time. Like Global FNF, it was 
also obtained from the classification of radar data. 
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1 The World's Forests 2000 


Product: LULC thematic
Dates: 1995/96
Formats: Raster
Pixel size: 1 km
Theme: 3 forest categories out of 6
Extent: Global
Updating: No
Change detection: No (only one date)
Overall accuracy: Expected to be >80%
Website of reference: http://www.fao.org/forest-resources-assessment/past-assessments/fra-2000/en/
Website language: English
Download site: http://www.fao.org/geonetwork/srv/en/main.home?uuid=b9f2ee20-88fd-11da-a88f-000d939bc5d8
Availability: Open Access
Format(s): .adf
Technical documentation: FAO (2000, 2001)
Other references of interest: none
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Global Thematic Land Use Cover Datasets ... 
Project 


The World's Forests 2000 map was one of the products 
generated within the context of the Global Forest Resources 
Assessment (FRA) for the year 2000. FRA is a project run 
by the Food and Agriculture Organization (FAO) that dates 
back to the year 1946. A new edition is issued every five 
years on average. 

The project, which is carried out in collaboration with the 
different countries that form part of the FAO, aims to assess 
the state of the world's forests and understand the changes 
that they undergo over time. Satellite imagery and remote 
sensing techniques were used for the first time in the 
FRA2000 survey. A global map of forests was produced as 
part of the project. The U.S. Geological Survey (USGS) 
EROS Data Center (EDC) was in charge of map production. 
Two extra maps were also produced as part of the project: an 
ecological zoning map and a map of protected forests. 


Production method 


The World's Forests 2000 map was produced in two stages. 
In the first stage, closed forest and open or fragmented forest 
categories were mapped on the basis of a classification of 
AVHRR imagery for the period 1995-1996. A complex 
methodology based on a mixture analysis model and a 
geographical stratification to account for regional variation 
in the mapped features was employed to calculate the frac- 
tion cover per pixel. The two LUC categories were extracted 
from these layers based on the tree cover percentages defined 
by the FAO: 40-100% for closed forest and 10-40% for
open or fragmented forest. 
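
As a rough illustration, the thresholding step described above can be sketched in Python. The function name is ours, and the treatment of the 40% boundary (assigned here to closed forest) is an assumption, since the two percentage ranges share that value. Pixels below 10% are simply left for the second production stage.

```python
def classify_forest(tree_cover_pct):
    """Map a per-pixel tree cover percentage to a forest category.

    Thresholds follow the ranges quoted in the text; a value of exactly
    40% is assigned to closed forest (an assumption of this sketch).
    """
    if not 0 <= tree_cover_pct <= 100:
        raise ValueError("tree cover must be between 0 and 100")
    if tree_cover_pct >= 40:
        return "closed forest"
    if tree_cover_pct >= 10:
        return "open or fragmented forest"
    return "not forest"  # handled by the second production stage


for pct in (5, 10, 39.9, 40, 85):
    print(pct, "->", classify_forest(pct))
```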

In the second stage, the Global Land Cover Characteris- 
tics Database (GLCC), obtained from a classification of 
AVHRR imagery for the period 1992/93, was used to map
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the remaining categories: other wooded land, other land 
cover and water. The fact that the different input data 
(AVHRR and GLCC) had different reference dates led to 
temporal inconsistency between forest and non-forest 
categories. 

Some auxiliary datasets were also used in the production 
of the map, such as ecoregion maps and digital elevation 
models. These helped to merge and split the different cate- 
gories being mapped. 


Product description 


The map can be downloaded as a zipped file containing the 
raster with the LUC information and other auxiliary infor- 
mation. The download includes two versions of the LUC 
map, one classifying the land covers in a range of values 
from 1 to 6 and the other classifying the land covers in a 
range of values from 100 to 600. 


Downloads 


The World's Forests 2000 


— Raster file with LUC map (for 2000) 

— Raster file with LUC map (info, forest) 

— Preview image of the product 

— ArcGIS file (.avl) with symbology for the raster 


Legend and codification 


Code    Label
1/100   Closed forest
2/200   Open or fragmented forest
3/300   Other wooded land
4/400   Other land cover
5/500   Water
6/600   Undefined
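
Since the download includes two codings of the same legend, a small lookup can treat both alike. The sketch below is only illustrative: the dictionary and function names are ours, and it assumes that the 100 to 600 version uses exact multiples of 100, as the legend suggests.

```python
# Legend of The World's Forests 2000, keyed by the 1-6 coding.
LEGEND = {
    1: "Closed forest",
    2: "Open or fragmented forest",
    3: "Other wooded land",
    4: "Other land cover",
    5: "Water",
    6: "Undefined",
}


def decode(code):
    """Return the label for a code from either version of the map."""
    if code >= 100:
        if code % 100 != 0:
            raise ValueError(f"unexpected code: {code}")
        code //= 100  # bring 100-600 codes back to the 1-6 coding
    if code not in LEGEND:
        raise ValueError(f"unexpected code: {code}")
    return LEGEND[code]


print(decode(2), "|", decode(200))
```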
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2 FCover—Fraction of Green Vegetation Cover 


Product: LULC thematic
Dates: Every 10 days from 1999 to 2020 (1 km); every 10 days from 2014 to the present (300 m)
Formats: Raster
Pixel size: 300 m, 1 km
Theme: Percentage of vegetation cover
Extent: Global
Updating: Expected, but no specific date
Change detection: Supported via specific layers of forest change
Overall accuracy: Not specified
Website of reference: https://land.copernicus.eu/global/products/fcover
Website language: English
Download site: https://land.copernicus.vgt.vito.be/PDF/portal/Application.html (separate collections for the 300 m and 1 km products)
Availability: Open Access under registration
Format(s): .nc
Technical documentation: Baret et al. (2016), Jolivet (2020), Lacaze et al. (2020), Martinez-Sanchez and Sánchez-Zapero (2020), Ramon et al. (2020), Sánchez-Zapero et al. (2018), Smets et al. (2018), Toté and Tansey (2020), Verger (2020), Wolfs et al. (2020)
Other references of interest: none


Project 


The Fraction of Vegetation Cover (FCover) is a product 
developed as part of the Copernicus programme, which is 
led and coordinated by the European Commission. The 
Copernicus Global Land Service (CGLS) aims to provide 
bio-geophysical land information to monitor the status and 
evolution of land surface across the globe. FCover provides 
information on the fraction of the ground surface that is 
covered by green vegetation. 

FCover is jointly produced with two other products, which 
also help to characterize the vegetation cover on Earth: the 
Leaf Area Index (LAI) and the Fraction of Absorbed Pho-
tosynthetically Active Radiation (FAPAR). All three were
initially produced at a spatial resolution of 1 km, although a 
finer version of the product has recently been developed at 
300 m. There are two versions of the 1 km product. The 
second version is an improved version of the first. 


Production method 


FCover is obtained after processing satellite imagery using a 
neural network method, which has been successively
improved in the different versions of the product. 

PROBA-V imagery is used to create the product with a
spatial resolution of 300 m. The product at 1 km also makes 
use of imagery from the VEGETATION sensor to increase 
the coverage over time. In both cases, various different 
techniques (smoothing, gap filling and temporal composit- 
ing) are applied to ensure the temporal consistency of the 
product time series. 


Product description 


The different versions of FCover at spatial resolutions of 
1 km and 300 m can be downloaded from the same website.
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In all cases, the product is distributed in single files covering 
the whole world for each period of 10 days. 

The product is delivered in the same format regardless of 
the particular version and/or spatial resolution chosen. It 
contains a raster with the LUC information, a preview pic- 
ture of the product and technical information regarding the 
creation process. The raster includes information on the 
vegetation cover fraction, plus a series of technical param- 
eters: uncertainty on the FCover, a quality flag, etc. 


Downloads 


FCover 300 m/1 km 


— Raster file with LUC map in netCDF4 format (.nc) 
— A metadata file (.xml) 

— Preview image of the product (.tiff) 

— PDFs with technical information about the product 


Legend and codification 


Code    Label
0-100   Vegetation fraction cover (0-1.0)


Practical considerations 


This is a thematically rich, complex product that some 
users may find hard to understand at first glance. Nonethe- 
less, the product's website includes all the relevant infor- 
mation to enable users to apply the product correctly and 
understand its characteristics. We therefore recommend
that users visit the website before consulting the technical
documents.
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3 Hybrid Forest Mask 2000 


Product: LULC thematic
Dates: 2000
Formats: Raster
Pixel size: 1 km
Theme: Percentage of forest cover
Extent: Global
Updating: Not expected
Change detection: No (only one date)
Overall accuracy: Expected to be >=85% and up to 93%
Website of reference: Not available
Website language: English
Download site: https://application.geo-wiki.org/Application/index.php
Availability: Open Access under registration
Format(s): .tiff, .img
Technical documentation: Schepaschenko et al. (2015)
Other references of interest: FAO (2010)

D. García-Álvarez and J. Lara Hinojosa


Project 


Researchers from several institutions across the world joined 
this project to produce a forest mask for the reference year 
2000 by data fusion. The purpose was to create a new LUC 
map that charted the extent of forests at a global level and 
outperformed previous maps of a similar nature. The 
resulting map is consistent with FAO national forest 
statistics. 

This is one of many projects that have benefited from the
Geo-Wiki platform, through which crowdsourced data were
collected for use in the production of the map.


Production method 


The forest map was produced by merging different LUC 
databases at global (GLC2000, GLCNMO, GlobCover, 
MODIS LC, MODIS VCF, Landsat VCF, Hansen Forest 
map) and regional (Congo Basin forest types map, 
Brazil PRODES forest mask, ALUM, Pan-European 
Forest/Non-Forest Map, NLCD 2006, Land cover of Rus- 
sia, Forest mask for European Russia) scales. Although the 
reference year for the Hybrid Forest Mask is 2000, many of 
the input maps refer to different years. 

The input maps were combined using a Geographically
Weighted Regression (GWR) algorithm that produced two
intermediate layers: a map of forest probability and a map of 
percentage forest cover. Reference points collected through 
crowdsourcing campaigns were used to train the GWR 
algorithm and validate the maps obtained. 

From the two intermediate layers obtained, three maps 
were finally created. The first map indicates the percentage 
of forest cover in pixels with a probability of being forest of 
more than 0.5. For the second map, the pixels with the 
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highest probability of being forest were selected until the 
number of pixels determined according to the FAO FRA 
national statistics was reached. The third map was obtained
by repeating the same procedure using regional statistics. 
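
The two selection rules described above can be sketched in a few lines of Python. This is a simplified illustration on plain lists, not the producers' implementation: the function names and the toy values are ours, and the non-forest value of 128 is taken from the product legend.

```python
NON_FOREST = 128  # code used for non-forest pixels in the product legend


def best_guess(probability, percent_cover, threshold=0.5):
    """Keep percentage forest cover only where forest probability > threshold."""
    return [pct if p > threshold else NON_FOREST
            for p, pct in zip(probability, percent_cover)]


def calibrate_to_statistics(probability, percent_cover, target_pixels):
    """Keep the most probable forest pixels until the target count is reached.

    target_pixels stands for the number of forest pixels implied by the
    FAO FRA statistics of a country or region (hypothetical input here).
    """
    ranked = sorted(range(len(probability)),
                    key=lambda i: probability[i], reverse=True)
    selected = set(ranked[:target_pixels])
    return [pct if i in selected else NON_FOREST
            for i, pct in enumerate(percent_cover)]


# Toy example with five pixels.
probability = [0.9, 0.2, 0.6, 0.55, 0.4]
percent_cover = [80, 10, 45, 30, 20]

print(best_guess(probability, percent_cover))
print(calibrate_to_statistics(probability, percent_cover, target_pixels=2))
```

In the toy example, the first rule keeps three pixels, whereas the calibrated rule keeps only the two most probable ones, which shows how the three published maps can differ in forest extent.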


Product description 

Each of the three maps produced by this project can be 
independently downloaded. In all cases, the download con- 
tains just one file about the LUC layer, with no auxiliary 
information. 

Downloads 

Hybrid Forest Mask 2000: Best guess / FAO FRA national statistics / FAO FRA regional statistics


— Raster file with information on tree canopy cover for the year 2000 


Legend and codification 


Code Label 
0-100 Forest Coverage (0-100%) 
128 Non forest cover 


Practical considerations 


The maps can be accessed online through the viewer 
included in the Geo-Wiki platform. Users should be aware 
that although the reference year for the product is 2000, it 
was obtained by merging products with different reference 
years. This map is therefore unsuitable for land change 
analysis. 
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4 SYNMAP Global Potential Vegetation 


Product: LULC thematic
Dates: 2000
Formats: Raster
Pixel size: 1 km
Theme: 43 vegetation categories out of 48
Compatible legends: GLCC, GLC2000, MODIS
Extent: Global
Updating: Not expected
Change detection: No (only one date)
Overall accuracy: Not specified
Website of reference: https://databasin.org/datasets/112a942ec4294e5284e63d5e6bf14b29
Website language: English, Spanish
Download site: https://webmap.ornl.gov/wcsdown/dataset.jsp?ds_id=10024
Availability: Open Access under registration
Format(s): .nc, .tiff, .xyz, .nitf, .img, .asc
Technical documentation: Jung et al. (2006)
Other references of interest: none




Project 


SYNMAP is a dataset produced by German researchers from 
the University of Jena. It was developed to meet the 
requirements of carbon cycle and vegetation models. To this 
end, all the classes in the dataset were defined in terms of 
plant functional type mixtures, with information about the 
type of tree leaf and its longevity. The dataset was obtained 
by merging data from existing global LUC products. 


Production method 


SYNMAP was obtained by merging GLCC, MODIS Land 
Cover and GLC2000. From GLCC and MODIS Land Cover, 
two different classification schemes were used: USGS and 
IGBP for GLCC and PFT and IGBP for MODIS Land
Cover. The tree classes obtained after merging the previous 
maps were complemented with information about leaf type 
and phenology from AVHRR-CFTC (Continuous Fields of 
Tree Cover). 

A specific legend adapted to the requirements of the 
carbon cycle and vegetation modelling communities was 
developed for SYNMAP. Each class in the new map was 
linked with each class in the input datasets through three 
affinity scores: one for life forms, one for leaf type and one 
for leaf longevity. AVHRR-CFTC provided auxiliary data 
regarding leaf attributes. The different maps were combined 
using fuzzy agreement to define the classes for the new map. 
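
To make the idea of affinity scores more tangible, the following Python sketch sums, for a single pixel, the affinity of each input class to a set of target life forms and keeps the best supported one. It is a deliberately simplified stand-in for the fuzzy agreement procedure: the input class names, the score values and the function name are all hypothetical, and only the life-form dimension is shown.

```python
from collections import defaultdict

# Hypothetical affinity scores (0-1) between input-legend classes and
# target life forms; the real scores are not given in this chapter.
AFFINITY = {
    ("GLCC", "Deciduous Broadleaf Forest"): {"trees": 1.0, "shrubs": 0.0, "grasses": 0.0},
    ("MODIS", "Mixed Forest"): {"trees": 1.0, "shrubs": 0.0, "grasses": 0.0},
    ("GLC2000", "Shrub Cover"): {"trees": 0.25, "shrubs": 1.0, "grasses": 0.25},
}


def fuse(pixel_labels):
    """Sum the affinity scores of all input maps and return the winner.

    pixel_labels maps each input dataset to the class it assigns to the pixel.
    """
    totals = defaultdict(float)
    for source, label in pixel_labels.items():
        for target, score in AFFINITY[(source, label)].items():
            totals[target] += score
    winner = max(totals, key=totals.get)
    return winner, dict(totals)


winner, totals = fuse({
    "GLCC": "Deciduous Broadleaf Forest",
    "MODIS": "Mixed Forest",
    "GLC2000": "Shrub Cover",
})
print(winner, totals)
```

Here two of the three input maps agree on a tree class, so the tree life form accumulates the highest score even though one input map disagrees.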


Product description 

SYNMAP can be downloaded in multiple formats via a web 
application. Users must select the product corresponding to 
their geographical area of interest. The product is down- 
loaded in the form of a raster file with LUC information. 


Downloads 


SYNMAP 
— Raster file with LUC map 


Legend and codification 


Code    Label
0       Water
1       Evergreen-needle-trees
2       Deciduous-needle-trees
3       Mixed-needle-trees
4       Evergreen-broadleaf-trees
5       Deciduous-broadleaf-trees
6       Mixed-broadleaf-trees
7       Evergreen-mixed-trees
8       Deciduous-mixed-trees
9       Mixed-trees
10      Evergreen-needle-trees and shrubs
11      Deciduous-needle-trees and shrubs
12      Mixed-needle-trees and shrubs
13      Evergreen-broadleaf-trees and shrubs
14      Deciduous-broadleaf-trees and shrubs
15      Mixed-broadleaf-trees and shrubs
16      Evergreen-mixed-trees and shrubs
17      Deciduous-mixed-trees and shrubs
18      Mixed-trees and shrubs
19      Evergreen-needle-trees and grasses
20      Deciduous-needle-trees and grasses
21      Mixed-needle-trees and grasses
22      Evergreen-broadleaf-trees and grasses
23      Deciduous-broadleaf-trees and grasses
24      Mixed-broadleaf-trees and grasses
25      Evergreen-mixed-trees and grasses
26      Deciduous-mixed-trees and grasses
27      Mixed-trees and grasses
28      Evergreen-needle-trees and crops
29      Deciduous-needle-trees and crops
30      Mixed-needle-trees and crops
31      Evergreen-broadleaf-trees and crops
32      Deciduous-broadleaf-trees and crops
33      Mixed-broadleaf-trees and crops
34      Evergreen-mixed-trees and crops
35      Deciduous-mixed-trees and crops
36      Mixed-trees and crops
37      Shrubs
38      Shrubs and grasses
39      Shrubs and crops
40      Shrubs and barren
41      Grasses
42      Grasses and crops
43      Grasses and barren
44      Crops
45      Barren
46      Urban
47      Snow and ice


Practical considerations 


SYNMAP was designed to satisfy the needs of a very 
specific community: carbon cycle and vegetation modellers. 
The dataset can be consulted online via a web application.¹


1 https://databasin.org/maps/new#datasets=112a942ec4294e5284e63d5e6bf14b29.
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5 GFCC—Global Forest Cover Change (GFCC30TC and GFCC30FCC) 


Product: LULC thematic
Dates: 2000, 2005, 2010, 2015 (tree cover); 1990-2000, 2000-2005 (forest change)
Formats: Raster
Pixel size: 30 m
MMU: Forest change: 0.27 ha
Theme: Percentage of tree cover and forest gains/losses
Extent: Global
Updating: Expected, but no date specified
Change detection: Yes, by comparing tree cover layers or through the layer of forest changes
Overall accuracy: Expected to be >88-90%
Website of reference: https://lpdaac.usgs.gov/products/gfcc30tcv003/ and https://lpdaac.usgs.gov/products/gfcc30fccv001/
Website language: English
Download site: https://lpdaac.usgs.gov/products/gfcc30tcv003/ and https://lpdaac.usgs.gov/products/gfcc30fccv001/
Availability: Open Access
Format(s): .tiff
Technical documentation: Sexton et al. (2013, 2016a, b)
Other references of interest: none


Project 


Global Forest Cover Change (GFCC) is a suite of products at 
30 m providing information about tree cover, forest cover 
change, water cover and surface reflectance. The last two 
products are auxiliary datasets used in the production of the 
first two: the GFCC Tree Cover Multi-Year (GFCC30TC) 
and the GFCC Forest Cover Change Multi-Year 
(GFCC30FCC). 

These datasets were developed by the Department of 
Geographical Sciences of the University of Maryland and 
form part of the NASA Making Earth System Data Records 
for Use in Research Environments (MEaSUREs). They aim 
to provide reference information for environmental moni- 
toring and forest assessment at a global scale. 

The aim of GFCC was to overcome the limitations 
imposed by the coarse resolution of the MODIS VCF 
dataset, as many forest changes take place at finer scales than 
250 m. To this end, GFCC rescales to 30 m the information
provided by the MODIS VCF dataset, which is described 
later on in this chapter. 

The Tree Cover layer is also known as the Landsat 
Vegetation Continuous Fields (VCF) and was initially 
launched in 2013, with updates continuing until 2016. It 
describes the state of, and changes in, tree cover. Forest Cover
Change focuses on forest covers and their changes. It was 
created from the Tree Cover layer, and there is only one 
edition. 


Production method 


The GFCC Tree Cover Multi-Year Global 30 m
(GFCC30TC) was obtained by applying a model to Landsat 
reflectance imagery to rescale the MODIS VCF Tree Cover 
Layer to 30 m. The model consisted of a piecewise linear
function of surface reflectance and temperature. Although 
Landsat imagery was available prior to the year 2000, the 
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Tree Cover layer is only available for the reference years 
2000, 2005, 2010 and 2015. This is because of the time- 
frame covered by MODIS VCF (2000-2019), which is 
essential for producing the dataset. 

In the latest version of the product, the entire Landsat 
imagery archive was employed to obtain the dataset, 
whereas in the initial versions the Landsat Global Land 
Survey collection was used. In addition, a water mask, 
specifically created from Landsat imagery through a 
classification-tree model, was used in a post-classification 
step as an auxiliary dataset for generating the Tree Cover 
layer. 

The layers of forest change (GFCC30FCC) were inde-
pendently produced for each of the periods available (1990-
2000 and 2000-2005) from the Tree Cover layer. First,
forest areas were extracted by applying a specific threshold 
to the Tree Cover Layer. Then, four change categories were 
defined for the period 2000-2005 based on changes in the 
Tree Cover layer: stable forest, stable non-forest, forest gain 
and forest loss. To calculate the change for the period 1990-
2000, a specific forest cover layer was obtained for 1990
from Landsat imagery based on a classification-tree 
algorithm. 
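
The logic behind the four change categories can be sketched as follows. The sketch assumes a forest threshold of 30% tree cover, which is purely hypothetical because the text only mentions "a specific threshold"; the output codes are those listed in the product legend, and the function name is ours.

```python
FOREST_THRESHOLD = 30  # hypothetical value; the actual threshold is not stated

# Codes from the GFCC30FCC legend, keyed by (forest at start, forest at end).
CHANGE_CODES = {
    (True, True): 11,    # persistent forest
    (True, False): 19,   # forest loss
    (False, True): 91,   # forest gain
    (False, False): 99,  # persistent non-forest
}


def change_code(tree_cover_start, tree_cover_end, threshold=FOREST_THRESHOLD):
    """Classify the change of one pixel between two tree cover dates."""
    was_forest = tree_cover_start >= threshold
    is_forest = tree_cover_end >= threshold
    return CHANGE_CODES[(was_forest, is_forest)]


# Tree cover (%) of four pixels in 2000 and 2005.
for start, end in [(60, 55), (60, 10), (5, 45), (5, 8)]:
    print(start, end, "->", change_code(start, end))
```

Note that the published product also provides a change probability layer, which this simple threshold comparison does not reproduce.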


Product description 


GFCC30TC and GFCC30FCC are distributed as two inde- 
pendent products. Users can download the two datasets 
through four different servers or tools: Data Pool,² NASA
Earthdata Search,³ USGS EarthExplorer⁴ and DAAC2Disk
Utility.⁵

The datasets are distributed in tiles. Users must therefore
download the tiles that cover their area of interest. The
online viewers provided in the NASA Earthdata Search and
USGS EarthExplorer tools are very useful for this purpose.
The Data Pool option also includes a preview image of the
tile as part of the download.


2 https://lpdaac.usgs.gov/tools/data-pool/.
3 https://lpdaac.usgs.gov/tools/earthdata-search/.
4 https://lpdaac.usgs.gov/tools/usgs-earthexplorer/.
5 https://lpdaac.usgs.gov/tools/daac2diskscripts/.
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Downloads 


GFCC30TC 


— Raster file with the tree cover percentage per pixel 
— Raster file with information about the LUC map error 


GFCC30FCC 


— Raster file with classes of forest change 
— Raster file with forest change probability 


Legend and codification 


GFCC30TC-Tree Cover

Code    Label
0-100   Percent of pixel area covered by tree cover (0-100)
200     Water
210     Cloud
211     Shadow
220     Fill Value

GFCC30FCC-Forest Cover Change Map

Code    Label
0       No Data
2       Shadow
3       Cloud
4       Water
11      Persistent Forest
19      Forest Loss
91      Forest Gain
99      Persistent Non-forest
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GFCC30FCC-Forest Cover Change Probability 
Code Label 
0-100 Probability (0-100%) of forest change
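
Because the Tree Cover layer mixes percentages with flag values, it is usually worth separating the two before any analysis. The following Python sketch, whose function names are ours, decodes single pixel values according to the GFCC30TC legend above and reports the share of pixels that carry an actual tree cover estimate.

```python
# Flag values from the GFCC30TC legend.
FLAGS = {200: "water", 210: "cloud", 211: "shadow", 220: "fill"}


def decode_tree_cover(value):
    """Return (kind, percentage); percentage is None for flagged pixels."""
    if 0 <= value <= 100:
        return ("tree cover", value)
    if value in FLAGS:
        return (FLAGS[value], None)
    raise ValueError(f"unexpected pixel value: {value}")


def usable_share(values):
    """Fraction of pixels that hold a tree cover percentage."""
    usable = sum(1 for v in values if decode_tree_cover(v)[0] == "tree cover")
    return usable / len(values)


pixels = [0, 35, 100, 200, 210, 211, 220, 80]
print([decode_tree_cover(v) for v in pixels])
print(usable_share(pixels))
```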






6 Hansen Forest Map—Global Forest Change 2000-2019 


Product: LULC thematic
Dates: 2000-2019
Formats: Raster
Pixel size: 30 m
Theme: Percentage of tree cover and forest gains/losses
Extent: Global
Updating: Expected, but no date specified
Change detection: Supported through specific layers of forest gains and losses
Overall accuracy: Not specified
Website of reference: https://earthenginepartners.appspot.com/science-2013-global-forest/download_v1.7.html
Website language: English
Download site: https://earthenginepartners.appspot.com/science-2013-global-forest/download_v1.7.html
Availability: Open Access
Format(s): .tiff
Technical documentation: Hansen et al. (2013)
Other references of interest: Hansen et al. (2014)
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Project 


The Hansen forest map was named after the researcher 
leading the project that produced the dataset, Matthew
Hansen, from the University of Maryland. Notwithstanding 
this, the project is the result of collaboration between sci- 
entists from various US institutions, including the USGS. 

The database was initially released in 2013. Since then, it 
has been revised and improved on several occasions. The 
latest published version of the product is Version 1.7, which 
included significant improvements on the previous version. 
This is expected to be the first step towards the creation of 
Version 2.0 of the product. 


Production method 


Landsat imagery was pre-processed and classified using the 
Google Earth Engine to create the Hansen forest map. A de- 
cision tree classifier was used to independently produce the 
base forest map and the yearly maps of forest loss. For
classification purposes, all vegetation taller than 5 m in 
height was considered to be a tree. Forest loss was defined as 
a stand-replacement disturbance. 


Product description 


The Hansen Global Forest Change dataset is made up of 
multiple layers. The base layer (treecover2000) provides 
information on forest cover across the world for the year 
2000. Two other layers (gain, lossyear) help to interpret the 
changes in forest cover since 2000 by identifying both the 
areas where new forest cover has appeared during this period 
and the areas in which forest cover has been lost. Forest 
cover losses are disaggregated per year. 

The product also includes an auxiliary layer which 
identifies the mapped areas, the water bodies and the areas 
with no data. Cloud-free composites of Landsat imagery for 
the product's first and last years (2000 and 2019) are also 
provided together with the LUC layers. 

The map is distributed in tiles. For this purpose, the world 
is divided into equal-size areas of 10 x 10 degrees. 
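
Finding the tile that contains a given location is simple arithmetic on this 10 x 10 degree grid, as the Python sketch below shows. The grid logic follows from the text; the label format (upper-left corner, e.g. "40N_010W") is our assumption about how the tiles are named and should be checked against the download page.

```python
import math

TILE_SIZE = 10  # degrees, as stated in the text


def tile_upper_left(lat, lon, size=TILE_SIZE):
    """Return (top latitude, left longitude) of the tile containing a point.

    A point lying exactly on a horizontal tile edge is assigned to the
    tile below that edge.
    """
    if not (-90 < lat <= 90 and -180 <= lon < 180):
        raise ValueError("coordinates out of range")
    top = math.ceil(lat / size) * size
    left = math.floor(lon / size) * size
    return top, left


def tile_label(lat, lon):
    """Format the upper-left corner as a label (assumed naming scheme)."""
    top, left = tile_upper_left(lat, lon)
    ns = "N" if top >= 0 else "S"
    ew = "E" if left >= 0 else "W"
    return f"{abs(top):02d}{ns}_{abs(left):03d}{ew}"


print(tile_label(37.2, -3.6))   # a point in southern Spain
print(tile_label(-3.5, 25.0))   # a point in the Congo Basin
```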


Downloads 


Tree canopy cover for year 2000 (treecover2000)


— Raster file with information on tree canopy cover for the year 2000 


Global forest cover gain 2000-2012 (gain) 


— Raster file with information about gains in forest cover 


Year of gross forest cover loss event (lossyear) 


— Raster file with information about the loss of forest cover 
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Data mask (datamask) 


— Raster file indicating the areas with no data, water surfaces and 
mapped land surface 


Legend and codification 


Tree canopy cover for year 2000 (treecover2000)
Code Label
0-100 Tree cover area density (1-100)


Global forest cover gain 2000-2012 (gain) 
Code Label Code Label 


0 Forest no gain 1 Forest gain 


Year of gross forest cover loss event (lossyear) 


Code Label Code Label 

0 No forest loss 10 Forest loss in 2010 
1 Forest loss in 2001 11 Forest loss in 2011 
2 Forest loss in 2002 12 Forest loss in 2012 
3 Forest loss in 2003 13 Forest loss in 2013 
4 Forest loss in 2004 14 Forest loss in 2014 
5 Forest loss in 2005 15 Forest loss in 2015 
6 Forest loss in 2006 16 Forest loss in 2016 
7 Forest loss in 2007 17 Forest loss in 2017 
8 Forest loss in 2008 18 Forest loss in 2018 
9 Forest loss in 2009 19 Forest loss in 2019 
Data mask (datamask) 

Code Label 

0 No data 

1 Mapped land surface 

2 Water bodies 
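
The lossyear coding listed above translates directly into calendar years: the code is the number of years elapsed since 2000, with 0 meaning no loss. A minimal Python sketch (the function names are ours) converts the codes and counts the loss pixels per year.

```python
from collections import Counter


def loss_year(code):
    """Convert a lossyear code to a calendar year (None means no loss)."""
    if code == 0:
        return None
    if 1 <= code <= 19:
        return 2000 + code
    raise ValueError(f"unexpected lossyear code: {code}")


def losses_per_year(codes):
    """Count the loss pixels of each year, ignoring pixels without loss."""
    return Counter(loss_year(c) for c in codes if c != 0)


codes = [0, 1, 1, 19, 0, 12]
print(losses_per_year(codes))
```

Multiplying each count by the pixel area would give an approximate loss area per year, bearing in mind the caveat about Landsat 8 discussed in the practical considerations for this dataset.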


Practical considerations 


The dataset can be easily visualized and consulted through a
web-based visualization tool.⁶ For those who want to work
with data for the whole Earth rather than for specific areas of
the world (tiles), the producers provide txt files with a full
list of download links for each of the 6 layers that make up
the product.

Landsat 8 imagery enabled better detection and mapping
of forest disturbance. Some uncertainties may therefore
emerge when comparing forest losses before and after the
inclusion of Landsat 8 imagery.


6 http://earthenginepartners.appspot.com/science-2013-global-forest.




7 MODIS Vegetation Continuous Fields—MOD44B 


Product: LULC thematic
Dates: 2000-2019
Formats: Raster
Pixel size: 250 m
Theme: Percentage of tree cover
Extent: Global
Updating: Expected
Change detection: Yes
Overall accuracy: Not specified
Website of reference: https://lpdaac.usgs.gov/products/mod44bv006/ and https://modis.gsfc.nasa.gov/data/dataprod/mod44.php
Website language: English
Download site: https://lpdaac.usgs.gov/products/mod44bv006/
Availability: Open Access under registration
Format(s): .hdf
Technical documentation: Hansen et al. (2003a, b), Townshend et al. (2011)
Other references of interest: Amarnath et al. (2017), Hansen et al. (2005), Jeganathan et al. (2009)
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Project 


The MODIS Vegetation Continuous Fields (VCF), also 
known as MOD44B, is a thematic LUC database developed 
by the Department of Geographical Sciences of the 
University of Maryland. This dataset was created in order to 
overcome the limitations of categorical LUC data, for which
it was necessary to define specific thresholds when charac- 
terizing vegetation cover. The team from the University of 
Maryland later applied Landsat imagery to produce a VCF 
product at finer spatial resolutions, thus improving the quality
of the information provided by this dataset. 

The dataset was initially launched in 2003. Since then, 
several versions of the product have been produced, each 
making an improvement on its predecessors. The last version 
of the product was launched in 2015 (v6). Versions 1 to 3 of 
the dataset were produced at a spatial resolution of 500 m. 
Subsequent versions were produced at 250 m. 


Production method 


MODIS VCF was obtained from MODIS imagery and other 
MODIS-related products, such as the MODIS Global 250 m 
Land/Water Map. A regression tree model was applied to the 
imagery to obtain the MODIS VCF dataset. The model was 
applied through open-access and other software customized 
for the production of the dataset. 


Product description 


MOD44B can be downloaded from different servers or tools, 
including AppEEARS, Data Pool, NASA Earthdata Search,
USGS EarthExplorer and OPeNDAP. In all cases, the pro- 
duct is distributed in tiles. Users must select their area of 
interest. 

The download consists of a single raster file made up of 
multiple bands, each one showing different information: 
percent of tree cover, percent of non-tree vegetation, percent 
of non-vegetation covers, and three extra bands with tech- 
nical and quality information about the product. 


Downloads 
Single mosaic 


— Raster file with multiple bands, including LUC and data quality 
information 
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Legend and codification 


Percent Tree Cover

Code        Label
0-100       Percent tree cover (0-100)
200         Water
253         Fill/Outside of projection

Percent Non-tree vegetation

Code        Label
0-100       Percent non-tree vegetation of each pixel (1-100)
200         Water
253         Fill/Outside of projection

Percent Non-vegetation cover

Code        Label
0-100       Percent with no vegetation of each pixel (1-100)
200         Water
253         Fill/Outside of projection

Percent Tree Cover Standard Deviation (SD)

Code        Label
0-10,000    Percent with standard deviation as regards Percent Tree Cover layer (1-10,000)

Percent Non-vegetation Standard Deviation (SD)

Code        Label
0-10,000    Percent with standard deviation as regards Percent Non-vegetation (1-10,000)
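
As an example of how the three percentage bands can be read together, the sketch below returns the dominant component of a pixel while respecting the water and fill codes from the legend. The function name and the tie-breaking rule (the first band listed wins) are our own choices for illustration.

```python
WATER, FILL = 200, 253  # special codes from the MOD44B legend


def dominant_cover(tree, non_tree, non_veg):
    """Return the dominant cover component of one pixel.

    The three arguments are the values of the Percent Tree Cover,
    Percent Non-tree vegetation and Percent Non-vegetation bands.
    """
    bands = {
        "tree": tree,
        "non-tree vegetation": non_tree,
        "non-vegetated": non_veg,
    }
    if WATER in bands.values():
        return "water"
    if FILL in bands.values():
        return "fill"
    if any(not 0 <= v <= 100 for v in bands.values()):
        raise ValueError("unexpected band value")
    # On a tie, max() keeps the first band in the order defined above.
    return max(bands, key=bands.get)


print(dominant_cover(60, 30, 10))
print(dominant_cover(5, 25, 70))
print(dominant_cover(200, 200, 200))
```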


Practical considerations 


Users must bear in mind that although the dataset is dis- 
tributed as a single raster file, this includes multiple layers 
with different, complementary information. Nonetheless, the 
core of the product is the band storing information about the 
percentage of tree cover. The dataset can also be consulted
online through a Web Map Service (WMS).⁷


7 https://lpdaacgis.cr.usgs.gov/arcgis/rest/services/WMS?f=pjson.
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8 PTC Global Version—Percent Tree Cover Global Version 


Product: LULC thematic
Dates: 2003, 2008
Formats: Raster
Pixel size: 1 km (2003), 500 m (2008)
Theme: Percentage of tree cover
Extent: Global / Regional
Updating: No
Change detection: Possible, but no information is available regarding its uncertainty
Overall accuracy: Not specified
Website of reference: https://globalmaps.github.io/ptc.html
Website language: English
Download site: https://globalmaps.github.io/ptc.html
Availability: Open Access
Format(s): .tiff
Technical documentation: none
Other references of interest: none
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Project 


The Percent Tree Cover Global version is a dataset created 
within the context of the Global Mapping Project, which 
aimed to create a global reference database of geospatial 
information. The project was promoted by the International 
Steering Committee for Global Mapping (ISCGM) in 
cooperation with National Geospatial Information Authori- 
ties (NGIAs) from different countries and regions across the 
world. It came to an end in 2016, when the ISCGM decided 
to wind up the project and transfer all the data to the 
Geospatial Information Section of the United Nations. 

The PTC map was generated by a group of researchers 
from the Geospatial Information Authority of Japan 
(GSI) and Chiba University. Two versions of the map were 
produced: one for the reference year 2003 and another for 
the reference year 2008. 


Production method 

The map was obtained via the classification of MODIS 
imagery. No other information is available about how the 
PTC Global version was produced. 

Product description 

A single download containing the map for the entire globe is
available for the year 2003. For the year 2008, the map is
distributed in 12 different tiles. Each tile covers an area of 90
degrees of latitude and 60 degrees of longitude. The
downloads only include the raster files with LUC information.
There are no auxiliary data.
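
The tiling described above (12 tiles of 90 degrees of latitude by 60 degrees of longitude) implies a grid of 2 rows and 6 columns. As a minimal sketch, a user could locate the tile containing a point as follows. The row and column indexing is hypothetical (row 0 is the northern band, column 0 starts at 180° W) and does not reflect the actual file names of the product.

```python
TILE_LAT_SPAN = 90.0  # degrees of latitude per tile (from the text)
TILE_LON_SPAN = 60.0  # degrees of longitude per tile (from the text)
N_ROWS = int(180 / TILE_LAT_SPAN)  # 2 latitude bands
N_COLS = int(360 / TILE_LON_SPAN)  # 6 longitude bands


def ptc_tile_index(lat, lon):
    """Return the (row, col) of the 2008 tile containing a point.

    Hypothetical indexing: row 0 is the northern band and
    column 0 starts at 180 degrees West.
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinates out of range")
    # min() keeps points on the southern or eastern edge inside the grid.
    row = min(int((90.0 - lat) // TILE_LAT_SPAN), N_ROWS - 1)
    col = min(int((lon + 180.0) // TILE_LON_SPAN), N_COLS - 1)
    return row, col


if __name__ == "__main__":
    print(ptc_tile_index(40.4, -3.7))
```

The two grid constants multiply to the 12 tiles mentioned in the text, which is a quick consistency check on the tile dimensions.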


Downloads 


PTC Global 2003/2008 


— Raster file with global tree cover 


Legend and codification 


Code Label 

0-100 Tree Coverage (0-100%) 
254 Water bodies 

255 No data 
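
Because the legend mixes percentages (0-100) with flag values (254, 255), the flags must be excluded before any statistics are computed. The following sketch, which works on a plain list of pixel codes rather than on a real raster, illustrates the idea. The function name and the returned keys are illustrative assumptions.

```python
WATER = 254    # legend code for water bodies
NO_DATA = 255  # legend code for missing data


def summarize_tree_cover(codes):
    """Summarize PTC pixel codes, keeping flag values out of the mean."""
    codes = list(codes)
    cover = [c for c in codes if 0 <= c <= 100]  # valid percentages only
    mean_cover = sum(cover) / len(cover) if cover else None
    return {
        "mean_tree_cover": mean_cover,
        "water_pixels": sum(1 for c in codes if c == WATER),
        "no_data_pixels": sum(1 for c in codes if c == NO_DATA),
    }


if __name__ == "__main__":
    print(summarize_tree_cover([0, 50, 254, 100, 255, 10]))
```

Averaging the raw codes without this step would treat water and no-data pixels as 254% and 255% tree cover.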


Practical considerations 


This dataset lacks auxiliary and technical information about 
specific characteristics and possible limitations, including 
data about its accuracy. It must therefore be used with 
caution. 

General information about the Global Mapping Project, within
which the dataset was created, can be found at
https://www.gsi.go.jp/kankyochiri/gm_report_e.html.
9 FNF: Global Forest Non-Forest Map


Product: LULC thematic
Dates: 2007, 2008, 2009, 2010, 2015, 2016, 2017
Formats: Raster
Pixel size: 25 m, 100 m, 1 km, 0.25°
Theme: Forest extent
Extent: Global
Updating: Expected
Change detection: Possible, but no information available about its uncertainty
Overall accuracy: Expected to be > 84%
Website of reference: https://www.eorc.jaxa.jp/ALOS/en/dataset/fnf_e.htm
Website Language: English
Download site: https://www.eorc.jaxa.jp/ALOS/en/palsar_fnf/registration.htm
Availability: Open Access under registration
Format(s): .hdr
Technical documentation: JAXA and EORC (2019), Shimada et al. (2014)
Other references of interest: Altunel et al. (2020)
Project


The Global Forest Non-Forest map (FNF) is one of the 
datasets produced by the Earth Observation Research Center 
(EORC) and the Japan Aerospace Exploration Agency 
(JAXA) as part of the ALOS-2/ALOS Science Project. The 
project is responsible for the ALOS satellites (ALOS and 
ALOS-2) and the datasets obtained from them. 

The FNF map aims to provide a reference dataset for the 
study of deforestation and forest degradation. As the map is 
obtained from imagery captured by Synthetic Aperture 
Radar (SAR) sensors (PALSAR and PALSAR-2), it can 
monitor forest changes regardless of the weather conditions, 
which is especially useful when monitoring tropical forests. 


Production method 


The main source of information for the FNF map is imagery 
from the PALSAR and PALSAR-2 sensors, on board the 
ALOS and ALOS-2 satellites. As these sensors are radar 
sensors, image classification is based on backscattering 
intensity values. Different parameters for classification are 
used depending on the region under consideration and its 
characteristics. 

The original map is produced at 25 m and later generalized
to coarser resolutions: 100 m, 1 km and 0.25°. Following
the FAO definition, areas of more than 0.5 ha covered by
trees with a canopy cover of over 10% are considered to be
forest.
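
The source does not document how the 25 m map is generalized. Purely as an illustration, one plausible approach is to compute the share of forest pixels inside each 4 x 4 block of 25 m pixels and to assign the block to one of the percentage classes used in the 100 m legend of this product. The aggregation rule, the handling of water and no-data pixels and the function name below are all assumptions.

```python
# 25 m legend codes
NO_DATA_25M, FOREST_25M, NON_FOREST_25M, WATER_25M = 0, 1, 2, 3
WATER_100M = 1  # 100 m legend code for water


def classify_100m(block):
    """Assign a 100 m class to a 4 x 4 block of 25 m codes (assumed rule)."""
    pixels = [p for row in block for p in row]
    land = [p for p in pixels if p in (FOREST_25M, NON_FOREST_25M)]
    if not land:
        # No land pixels: water if any water is present, otherwise undefined.
        return WATER_100M if WATER_25M in pixels else None
    pct = 100.0 * land.count(FOREST_25M) / len(land)
    if pct < 10:
        return 3  # Non-forest
    if pct <= 25:
        return 4  # Forest (10-25%)
    if pct <= 50:
        return 5  # Forest (26-50%)
    if pct <= 75:
        return 6  # Forest (51-75%)
    return 7      # Forest (76-100%)


if __name__ == "__main__":
    quarter_forest = [[1, 1, 1, 1]] + [[2, 2, 2, 2]] * 3
    print(classify_100m(quarter_forest))
```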


Product description 


Users can download the FNF map for each available year at 
different spatial resolutions. However, the map at 25 m is the 
only one available for all the different years covered by the 
product. 


D. García-Álvarez and J. Lara Hinojosa 


Whereas the maps at 1 km and 0.25° can be downloaded
as a single file covering the whole globe, the FNF map at higher
resolutions (25 m, 100 m) is split into different tiles to
facilitate downloading. Users can download the tile for their
particular area of interest. All downloads include the FNF
map for the selected area as well as the satellite imagery used
to obtain it.


Downloads 

Global Forest Non-Forest map (FNF)—25 m/100 m/1 km/0.25° 
— Raster file with LUC map 

— Raster files with satellite imagery 

Legend and codification 


Global Forest Non-Forest map (FNF), 25 m

Code Label
0 No Data
1 Forest
2 Non-forest
3 Water

Global Forest Non-Forest map (FNF), 100 m

Code Label
1 Water
3 Non-forest (0-9%)
4 Forest (10-25%)
5 Forest (26-50%)
6 Forest (51-75%)
7 Forest (76-100%)

Global Forest Non-Forest map (FNF), 1 km / 0.25°

Code Label
0-100 Forest Coverage (0-100%)
200 Water
235 No Data
10 Forests of the World 2010


Product: LULC thematic
Dates: 2010
Formats: Raster
Pixel size: 250 m
Theme: Percentage of tree cover
Extent: Global
Updating: No
Change detection: No (only one date)
Overall accuracy: Not specified
Website of reference: http://www.fao.org/geonetwork/srv/en/main.home?uuid=063720fb-79b5-44e5-832b-1c03f6b845ac
Website Language: English
Download site: http://www.fao.org/geonetwork/srv/en/main.home?uuid=063720fb-79b5-44e5-832b-1c03f6b845ac
Availability: Open Access
Format(s): .adf
Technical documentation:
Other references of interest: FAO (2010); Ridder (2007); FAO et al. (2009)

Project 


The Food and Agriculture Organization (FAO) carries out 
the Global Forest Resources Assessment (FRA) on average 
once every five years. The first LUC map produced for this 
project was the World's Forests 2000, described above. For 
the 2015 edition of the FRA, a new map for the reference 
year 2010 was produced. 

The Forests of the World 2010 map was produced within 
the framework of the FRA 2010 and 2015 Global Remote 
Sensing Surveys. These surveys aimed to provide comple- 
mentary information using remote sensing techniques and 
Landsat imagery, in addition to the data that was normally 
collected and analysed through the different FRA projects. 

The FRA Global Remote Sensing Surveys, carried out by 
the FAO in collaboration with the Joint Research Centre 
(JRC) of the European Commission, provided systematic 
alphanumerical information on the dynamics of forest covers 
and uses for four dates (1990, 2000, 2005, 2010) at three 
different scales: regional, ecozone and global. 

A new participatory global remote sensing survey is 
currently ongoing as part of the FRA 2020 project. 


Production method 


The Forests of the World 2010 map is partially based on the 
MODIS/Terra Vegetation Continuous Fields (VCF) product. 
Other auxiliary datasets were also employed in its produc- 
tion: water data from the Shuttle Radar Topography Mission 
(SRTM) and the MODIS global water mask; a Digital Ele- 
vation Model from the SRTM; the Global Administrative 
Unit Layer (GAUL); and a dataset of Global ecological 
zones. No information is available about the procedure followed
to merge this information.


Product description 

The map is downloaded as a single zip file, which contains 
the LUC raster and a series of auxiliary files that do not, 
however, provide any extra information to the user. 


Downloads 


Forests of the world 2010
- Raster file with LUC map (fao_fra2010)


Legend and codification 


Code Label
1-100 Percent of pixel area covered by tree cover (0-100)


Practical considerations 


No technical information is available about the way the map 
was produced, which makes it difficult to understand its 
characteristics and potential disadvantages. As this map was 
created on the basis of information provided by the 
MODIS VCF map (see Sect. 7), there may be high corre- 
lation between the two maps. 

When downloading the data, users will find many files
making up the LUC map. To represent the map in QGIS they
can open any of the files in the "fao_fra2010" folder.
11 TanDEM-X Forest/Non-Forest Map


Product: LULC thematic
Dates: 2011 / 15
Formats: Raster
Pixel size: 50 m
Theme: Forest extent
Extent: Global
Updating: No
Change detection: No (only one date)
Overall accuracy: Expected to be >90%
Website of reference: https://www.dlr.de/hr/en/desktopdefault.aspx/tabid-12538/21873_read-50027/
Website Language: English
Download site: https://download.geoservice.dlr.de/FNF50/
Availability: Open Access
Format(s): .tiff
Technical documentation: Bueso Bello et al. (2019), Martone et al. (2016, 2018a, b)
Other references of interest:
Project


The TanDEM-X Forest/Non-Forest Map is a dataset pro- 
duced by the Microwaves and Radar Institute of the German 
Aerospace Center (DLR). It aims to provide useful infor- 
mation for environmental assessment and forest monitoring. 
Together with the Global Forest Non-Forest map, described 
earlier in this chapter, it was one of the first projects to use 
radar data for forest mapping at a global scale. Radar over- 
comes some of the limitations associated with forest mapping 
using optical sensors, in that it can provide accurate LUCC 
information regardless of the weather or daylight conditions. 

The dataset was produced within the context of the 
TanDEM-X mission. It makes use of TanDEM-X bistatic 
interferometric synthetic aperture radar (InSAR) data, 
mainly captured to produce a very precise Digital Elevation 
Model (DEM) at a global scale. 


Production method 


The TanDEM-X Forest/Non-Forest Map was obtained by 
classifying and processing interferometric synthetic aperture 
radar (InSAR) data acquired by the TanDEM-X mission 
over the period 2011-2015. The original data at 3 m was 
resampled at 50 m for the classification. It includes two full 
coverages of the Earth's surface. 

Different factors in the InSAR data were used in the 
classification of forest and non-forest areas. The most 
important of these was the volume correlation factor. It 
quantifies the amount of decorrelation caused by multiple 
scattering within a volume, which is usually due to the 
presence of vegetation. The other factors employed in the 
classification process were bistatic coherence, calibrated 
amplitude and DEM height information. 

All this information was provided as input for a fuzzy 
multi-clustering classification process at the scene level. 
Specific parameters were used for different forest types 
(tropical, temperate and boreal forest) due to differences in 
forest structure, density and tree height. 

Once the classification had been carried out for all the 
available scenes, a Forest/Non-Forest Map was obtained by 
mosaicking all the classification results. In a post- 
classification stage, the accuracy of the map was improved 
using auxiliary layers that provide information about urban 
areas, water bodies, deserts and the tree line, i.e. the virtual 
line marking the altitudes above which trees do not grow. 
Product description


The TanDEM-X Forest/Non-Forest Map is distributed in
1 x 1° tiles. Users can select those within their area of
interest via the online viewer available at the download
website (see above). The files are also available through an
HTTPS web browser: https://download.geoservice.dlr.de/FNF50/files/.
In the latter case, users must input the latitude
and longitude values for their specific area of interest when
downloading the files.

The download includes the forest/non-forest map plus 
three auxiliary layers providing technical information about 
the classification. Interested users can also download the 
product's metadata as a separate file from the download 
website. 


Downloads 


TanDEM-X Forest/Non-Forest Map 


- Raster file with forest/non-forest map
- Raster file with coverage information (number of mosaicked acquisitions per pixel)
- Raster file with the number of reliable super pixels in input
- Raster file with the date of the most recent super pixels
- Text file with information about the data acquisition process
- PDF files with the product's license agreements in English and German
- Image preview of the product


Legend and codification 


Code Label
0 Invalid pixels and settlements
1 Forested areas
2 Non-forested areas
3 Water bodies


Practical considerations 


This dataset was produced by means of a complex produc- 
tion method that is difficult to understand for those without 
specialist knowledge of radar data. Those wishing to find out 
more about this dataset should read the guide cited in the 
specifications above and other information about the dataset 
available at https://geoservice.dlr.de/web/dataguide/fnf50/. 
Abstract


There is a wide variety of global thematic Land Use 
Cover (LUC) datasets characterizing agricultural covers. 
Most of them focus on cropland areas, providing 
information on their extent or the percentage of cropland 
cover on the ground. In some cases, the focus is more 
specific and they provide information on cropland 
irrigation practices. In other cases, specific maps charting 
the extent of different crops are also available. In this
chapter, we review 8 different datasets with a spatial 
resolution of at least 1 km. There are many other datasets 
characterizing agricultural covers at coarser resolutions, 
such as the Historic Croplands Dataset, GMRCA or 
GIAM. Their coarse resolution hampers their potential 
application in practice, which is why they are not 
described in detail in this chapter. Nor do we analyse 
FROM-GC, a dataset mapping the extent of global 
cropland at 30 m, because it is not currently accessible. 
GFSAD30 has the highest resolution of all the datasets 
reviewed (30 m). It also provides some of the most 
up-to-date information (2015). However, it only charts 
the extent of cropland. As part of an associated project, 
GFSAD1KCD and GFSAD1KCM characterize cropland
areas in 9 and 7 categories respectively at 1 km for 2010.
They provide information on the irrigation status of the
crops. GFSAD1KCD and GFSAD1KCM were obtained
from data fusion. This method is commonly used in the
production of many of the cropland datasets reviewed:
IIASA-IFPRI cropland map, Global Synergy Cropland 
Map, Unified Cropland Layer (UCL) and ASAP Land 
Cover Masks. The IIASA-IFPRI (2005) and ASAP maps 
provide information on the proportion of cropland at a 
spatial resolution of 1 km. ASAP also includes a map on 
rangeland covers, and as such is the only dataset 
described in this chapter that maps a cover other than 
croplands. The Global Synergy Cropland Map (2010) and 
the Unified Cropland Layer (2014) also map cropland 
proportions, although they have been produced at higher 
spatial resolutions: 500 and 250 m respectively. The 
Global Cropland Extent product maps the extent of 
cropland at 250 m based on imagery from 2000-2008. 
Although thematically limited, this dataset is less affected 
by time variability, as it is based on imagery taken over a 
long period (8 years). Finally, GRIPC maps the extent of 
three types of cropland area (irrigated, rainfed and paddy 
crops) at 500 m for 2005. 
1 Global Cropland Extent


Product: LULC thematic
Dates: 2000 / 08
Formats: Raster
Pixel size: 250 m
Theme: Cropland extent
Extent: Global
Updating: Not expected
Change detection: No (only one date)
Overall accuracy: Not specified
Website of reference: https://glad.umd.edu/projects/croplands/globalindex.html
Website Language: English
Download site: https://glad.umd.edu/projects/croplands/dataindex.html
Availability: Open Access
Format(s): .tiff
Technical documentation: Pittman et al. (2010)
Other references of interest:




Global Thematic Land Use Cover Datasets Characterizing ... 
Project 


The Global Cropland Extent was a map developed for the
Global Agriculture Monitoring Project (GLAM). The project,
promoted by NASA, the USDA, and Maryland and
South Dakota State universities, aimed to take advantage of
the new generation of NASA satellite observations to
enhance the agricultural monitoring and crop-production
estimation work carried out by the USDA Foreign Agricultural
Service (FAS). At the time it was produced, Global
Cropland Extent was the highest resolution cropland map at
global scale produced using synoptic inputs.


Production method 


The Global Cropland Extent map was obtained after 
thresholding a crop probability layer obtained from 16-day 
composites of MODIS imagery for the period 2000-2008. 
The probability layer was generated by averaging the results
from multiple decision-tree classifications. The classifiers were
trained with sub-pixel data obtained from multiple sources:
GeoCover, AfriCover, USDA Cropland Data Layer, NLCD,
Agriculture and Agri-Food Canada, South Africa State of the
Environment and CLC.

The selected threshold for differentiating between crop- 
land and non-cropland areas in the probability layer was 
decided on the basis of information from the FAS Produc- 
tion, Supply and Distribution (PSD) database. The database 
provided, per country, the median harvested area of pro- 
duction field crops (barley, corn, cotton, oats, rice, rye, 
sorghum, soybeans and wheat) for the period 2000-2008. 
The pixels with the highest cropland probability were then 
considered cropland until those area thresholds were met. In 
the European Union, the threshold was defined for the whole 
EU area rather than at country level. 
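
The thresholding logic described above can be sketched as follows: pixels are ranked by cropland probability and labelled as cropland until the reported harvested area is reached. This is only an illustration of the principle, not the production code. The function name and the nominal pixel area of 6.25 ha (250 m x 250 m) are assumptions.

```python
def threshold_by_area(probabilities, pixel_area_ha, target_area_ha):
    """Label the most probable pixels as cropland until the target area is met.

    probabilities: cropland probability per pixel (0 means water or no chance).
    Returns a list of booleans in the original pixel order.
    """
    # Pixel indices sorted from highest to lowest cropland probability.
    order = sorted(range(len(probabilities)),
                   key=lambda i: probabilities[i], reverse=True)
    is_cropland = [False] * len(probabilities)
    mapped_ha = 0.0
    for i in order:
        if mapped_ha >= target_area_ha or probabilities[i] <= 0:
            break
        is_cropland[i] = True
        mapped_ha += pixel_area_ha
    return is_cropland


if __name__ == "__main__":
    # Five pixels of 6.25 ha each and a target of 12.5 ha.
    print(threshold_by_area([90, 10, 75, 0, 60], 6.25, 12.5))
```

With this approach the probability threshold is implicit: it is the probability of the last pixel selected, and it therefore differs from country to country.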


Product description 


The Global Cropland Extent map is distributed in tiles following
the MODIS tile grid.¹ To identify the file or files that
fall within their area of interest, users must know the horizontal
and vertical tile numbers that identify each area. The
download only includes the raster file with the cropland
information; no additional data are provided.


1 The MODIS tile grid is available at https://modis-land.gsfc.nasa.gov/MODLAND_grid.html.
The Cropland probability layer can also be downloaded
following the same procedure. In addition, the project provides
a global mosaic at a spatial resolution of 1 km,
merging all the tiles in one file.
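
For users who prefer not to look up the tile numbers on the grid image, the horizontal (h) and vertical (v) indices can be approximated from geographic coordinates. The sketch below assumes the standard MODIS sinusoidal grid of 36 x 18 tiles, each spanning 10 degrees at the equator. The function names are illustrative, and the result should be checked against the official grid for points close to tile edges.

```python
import math


def modis_tile(lat, lon):
    """Return the (h, v) indices of the MODIS sinusoidal tile for a point."""
    # Rows are 10-degree latitude bands counted from the North Pole.
    v = min(int((90.0 - lat) // 10), 17)
    # In the sinusoidal projection, x shrinks with the cosine of latitude.
    x_deg = lon * math.cos(math.radians(lat))
    h = min(int((x_deg + 180.0) // 10), 35)
    return h, v


def tile_name(lat, lon):
    """Format the tile indices as they appear in file names, e.g. h17v04."""
    h, v = modis_tile(lat, lon)
    return f"h{h:02d}v{v:02d}"


if __name__ == "__main__":
    # A point in central Spain falls in the example tile used in this section.
    print(tile_name(40.4, -3.7))
```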


Downloads 


Global Cropland Extent h17v04 


— Raster file with cropland extent (.tiff) 


Global Cropland Probability h17v04 
— Raster file with cropland probability (.tiff) 


Legend and codification 


Global Cropland Extent 


Code Label 

0 Cropland 

1 No cropland 

254 Water 

Global Cropland Probability 

Code Label 

0 Water 

1-100 Cropland probability (1-100%) 


Practical considerations 


According to the accuracy analyses carried out by the production
team, the accuracy of the Global Cropland Extent map
varies considerably between cropland types.
Intensive broadleaf crop regions (corn and soybean)
are the best mapped, while wheat-growing regions and,
especially, rice production regions, present low levels of
accuracy. The dataset also has problems mapping cropland
areas in regions without intensive agriculture, like Africa.

Because of the 8-year timespan of the MODIS imagery 
used as an input for the production of the Global Cropland 
Extent, the dataset can be considered insensitive to 
inter-annual variability of cropland covers. 




2 IIASA-IFPRI Cropland Map 


Product: LULC thematic
Dates: 2005
Formats: Raster
Pixel size: 1 km
Theme: Percentage of cropland cover
Extent: Global
Updating: Not expected
Change detection: No (only one date)
Overall accuracy: Expected to be > 82%
Website of reference: https://geo-wiki.org/Application/index.php
Website Language: English
Download site: https://geo-wiki.org/Application/index.php
Availability: Open Access after registration
Format(s): .img
Technical documentation: Fritz et al. (2015)
Other references of interest: Fritz et al. (2011)

Project 


The IIASA-IFPRI Cropland Map was produced by an 
international consortium of researchers led by the Interna- 
tional Institute for Applied Systems Analysis (IIASA) and 
the International Food Policy Research Institute (IFPRI). 
The project builds on the experience and the method pro- 
posed by Fritz et al. (2011) for mapping cropland areas in 
sub-Saharan Africa. It is part of a broader plan to provide 
better LUC mapping for food security studies and policies. 

The aim of the project was to improve the spatial repre- 
sentation of cropland areas by fusing existing datasets. 
Unlike previous efforts, the focus was on cropland percent- 
age instead of cropland extent. In addition, the project 
delivered the first ever global field-size map. 


Production method 


The IIASA-IFPRI Cropland Map was obtained by merging 
the cropland cover information provided by global 
(GLC2000, MODIS 2005, GlobCover), regional (CLC, 
AFRICOVER, Cropland mask for Africa) and national (14 
countries) datasets. The datasets with a spatial resolution 
finer than 1 km were resampled and combined in a common 
grid at a spatial resolution of 1 km. For those datasets that do
not provide information about the percentage of cropland,
and merely indicate its presence or absence, minimum,
average and maximum percentages of cropland cover were
assigned according to the definition of the cropland
categories.

Once all the input information had been homogenized, 
the different datasets were combined in a synergy layer. The 
synergy layer defines the cropland areas according to the 
agreement of the input datasets. The combination of datasets 
was hierarchical, according to their accuracy, which was 
determined by reference data collected through the 
Geo-Wiki platform. Together with the synergy layer, three 
other layers stating the minimum, average and maximum 
cropland percentage cover were obtained by averaging the 
minimum, average and maximum cropland percentage val- 
ues from the input maps. 
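
The two steps described above (assigning percentage ranges to categorical classes and averaging them across datasets) can be illustrated for a single pixel. The class names and the percentage ranges in the lookup table are hypothetical and serve only to show the mechanics.

```python
# Hypothetical (minimum, average, maximum) cropland percentages per class.
CLASS_RANGES = {
    "cultivated": (50.0, 75.0, 100.0),
    "mosaic": (20.0, 35.0, 50.0),
    "none": (0.0, 0.0, 0.0),
}


def average_ranges(ranges):
    """Average the (min, avg, max) percentages given by several datasets."""
    n = len(ranges)
    return tuple(sum(r[k] for r in ranges) / n for k in range(3))


if __name__ == "__main__":
    # One pixel seen as "cultivated" by one dataset and "mosaic" by another.
    pixel = [CLASS_RANGES["cultivated"], CLASS_RANGES["mosaic"]]
    print(average_ranges(pixel))
```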
The final IIASA-IFPRI Cropland Map was obtained by
combining the synergy and average cropland percentage 
layers with national cropland statistics provided by FAO. 
The areas with the highest probability of being cropland 
according to the synergy layer were selected until the total 
surface area for cropland according to FAO statistics for 
each country was reached. The specific area of cropland 
allocated to each pixel (e.g. 70 ha of cropland) was deter- 
mined based on the average cropland percentage cover layer. 

Finally, a visual verification with Google Earth imagery 
was carried out at the national level to correct possible 
omission errors. 


Product description 

The dataset can be downloaded as a single compressed file
(zip), including the raster with the LUC information and an
auxiliary file with a brief technical description of the raster
file.


Downloads 

IIASA-IFPRI Cropland map 

— Raster file with cropland percentage (.img) 

— Text file with technical information about the raster 


Legend and codification 


Code Label
0-100 Cropland Coverage (0-100%)


Practical considerations 


The IIASA-IFPRI Cropland Map can be accessed online via
the Geo-Wiki platform. The associated field-size map can be 
very useful for researchers studying food security and other 
aspects of cropland uses and practices. The field-size map 
can be downloaded and visualized at the same website as the 
Cropland map. 
3 GRIPC: Global Rainfed, Irrigated, and Paddy Croplands


Product: LULC thematic
Dates: 2005
Formats: Raster
Pixel size: 500 m
Theme: 3 cropland classes out of 4
Extent: Global
Updating: Not expected
Change detection: No (only one date)
Overall accuracy: Expected to be >69%
Website of reference: Not available
Website Language: English
Download site: http://ftp-earth.bu.edu/public/friedl/GRIPCmap/?C=S;O=A
Availability: Open Access
Format(s): .tiff
Technical documentation: Salmon et al. (2015)
Other references of interest: Liu et al. (2018)

Project 


Global Rainfed, Irrigated and Paddy Croplands (GRIPC) is a 
map developed by researchers from German and American 
universities, who aimed to overcome some of the limitations 
of previous datasets focusing on irrigated croplands. At the 
time it was released, the dataset offered an up-to-date rep- 
resentation of irrigated croplands across the world at the 
highest spatial resolution available. It could be useful for 
those studying agricultural productivity, agricultural 
hydrology and food security in general. 


Production method 


The GRIPC map is made up of 4 different categories. 
Uncropped areas were extracted from the non-cropland 
categories of the MODIS Land Cover database for the period 
2004-2006. Paddy croplands were independently mapped
from different sources, such as crop inventories, due to the 
challenges involved in classifying cloudy imagery in the 
tropics. Rainfed and irrigated cropland were mapped using a 
decision-tree classification algorithm (C4.5) and the 
“boosting” machine learning technique. 

MODIS imagery was used as the input for the classifi- 
cation. Climate and agroecozones data were also used as 
auxiliary datasets. Probability layers obtained from the 
classification were combined with information from national 
and subnational cropland inventory-based datasets to finally 
map the rainfed and irrigated cropland areas. The informa- 
tion from these datasets served to define the probabilities of 
each category occupying a pixel. Then, the classification
results were combined with these probabilities using
Bayes' rule to obtain the final map.
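
In general terms, Bayes' rule states that the posterior probability of a class is proportional to the classifier evidence multiplied by the prior probability of that class. The sketch below shows this combination for one pixel. It treats the classifier output as the likelihood term, which is a simplifying assumption, and the function name and example values are illustrative.

```python
def bayes_combine(likelihoods, priors):
    """Combine classifier outputs with inventory-based priors for one pixel.

    Both arguments map class names to probabilities.
    Returns the normalized posterior probability of each class.
    """
    unnormalized = {c: likelihoods[c] * priors[c] for c in likelihoods}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all classes have zero posterior probability")
    return {c: value / total for c, value in unnormalized.items()}


if __name__ == "__main__":
    # The classifier slightly favours rainfed cropland, but the inventory
    # data indicate that irrigation dominates in this area.
    posterior = bayes_combine(
        {"rainfed": 0.6, "irrigated": 0.4},
        {"rainfed": 0.2, "irrigated": 0.8},
    )
    print(max(posterior, key=posterior.get))
```

In this example the prior outweighs the classifier, so the pixel is assigned to the irrigated class.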
Product description


GRIPC is distributed in 273 tiles, according to the MODIS
tile grid.² Users must consult the tiles that correspond to
their area of interest. A lower-resolution version of the 
product, at 5 arc minutes, and a file with the main technical 
characteristics of the dataset, are also available for 
download. 


Downloads 


GRIPC h17v04 


— Raster file with cropland information (.tiff) 


Legend and codification 


Code Label Code Label 
1 Rainfed cropland 3 Paddy cropland 
2 Irrigated cropland 4 No cropland 


Practical considerations 


GRIPC does not map various important irrigated cropland
categories, such as deficit irrigation (irrigation occurring
less than once a year), permanent crops (orchards and
vineyards) and unharvested pastures. As there is no official
website describing the GRIPC and its characteristics, users
wishing to find out more about this dataset should consult
the scientific paper in which it was presented (Salmon et al.
2015).


2 The MODIS tile grid is available at https://modis-land.gsfc.nasa.gov/MODLAND_grid.html.
4 GFSAD1KCM and GFSAD1KCD


Product: LULC thematic
Dates: 2010
Formats: Raster
Pixel size: 1 km
Minimum mapping unit: 0.81 ha
Theme: 5 cropland classes out of 7, focusing on cropland extent (GFSAD1KCM); 8 cropland classes out of 10, focusing on crop dominance (GFSAD1KCD)
Extent: Global
Updating: Not expected
Change detection: No (only one date)
Overall accuracy: Expected to be >70%
Website of reference: https://lpdaac.usgs.gov/products/gfsad1kcmv001/ and https://lpdaac.usgs.gov/products/gfsad1kcdv001/
Website Language: English
Download site: https://lpdaac.usgs.gov/products/gfsad1kcmv001/ and https://lpdaac.usgs.gov/products/gfsad1kcdv001/
Availability: Open Access after registration
Format(s): .tiff
Technical documentation: Teluguntla et al. (2020), USGS EROS (2017)
Other references of interest: Friedl et al. (2010), Pittman et al. (2010), Portmann et al. (2010), Ramankutty et al. (2008), Thenkabail and Lyon (2009), Thenkabail et al. (2009), Thenkabail et al. (2010), Thenkabail et al. (2011), Thenkabail et al. (2012), Yadav and Congalton (2018), Yu et al. (2013)

Project 


The GFSAD1KCM and GFSAD1KCD datasets were created by
NASA and the USGS within the context of the MEaSUREs 
(Making Earth System Data Records for Use in Research Envi- 
ronments) programme. MEaSUREs is one of the competitive 
programmes of the Earth Science Data Systems (ESDS), which 
aims to take full scientific advantage of NASA missions. 

MEaSUREs projects make use of data from NASA 
satellites to produce innovative products that meet the needs 
of the research community, inform policy-making and pro- 
vide a better understanding of the planet. GFSAD (Global 
Food Security Support Analysis Data) is a specific MEa- 
SUREs project focused on mapping agricultural areas to 
contribute to global food security policies. The project aims 
to improve global cropland mapping, by providing a 
methodology that can map cropland areas across the world 
quickly, consistently and accurately. 

As part of the GFSAD projects, cropland maps have been 
produced at three different spatial resolutions (1 km, 250 m 
and 30 m). The maps at 1 km and 30 m cover the whole 
globe. Various different supranational datasets are available 
at 250 m for Africa, Australia and South Asia at different 
years of reference. A similar dataset at 250 m is also avail- 
able yearly for the United States from 2001 to 2013. 

For the product at 1 km, two complementary maps were
generated: GFSAD1KCM, mapping the extent of cropland at
a global level, and GFSAD1KCD, which maps crop dominance
across the world. The map at 30 m is described later in
this chapter.


Production method 


GFSAD1KCM and GFSAD1KCD were produced separately
by aggregating different existing products. The input maps were 
first resampled at the same resolution (1 km) and later overlaid. 

GFSAD1KCM was created by aggregating the maps
produced by Thenkabail et al. (2009, 2011), Pittman et al. 
(2010), Yu et al. (2013), and Friedl et al. (2010). Cropland 
extent was obtained by agreement of these four maps. Other 
information and indicators, such as irrigation status, irriga- 
tion or rainfed dominance, were obtained from the map 
developed by Thenkabail et al. (2009, 2011). 
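
The source does not specify the exact agreement rule. As a hedged illustration, agreement between binary cropland masks can be expressed as the number of input maps that flag each pixel as cropland, followed by a minimum number of votes. The function names and the `min_votes` parameter are assumptions.

```python
def agreement_count(maps):
    """Count, per pixel, how many input maps flag cropland (1) rather than 0."""
    return [sum(votes) for votes in zip(*maps)]


def cropland_by_agreement(maps, min_votes):
    """Mark a pixel as cropland when at least min_votes maps agree."""
    return [count >= min_votes for count in agreement_count(maps)]


if __name__ == "__main__":
    # Four input maps, each covering the same four pixels.
    inputs = [
        [1, 0, 1, 0],
        [1, 1, 0, 0],
        [1, 1, 1, 0],
        [1, 0, 1, 0],
    ]
    print(agreement_count(inputs))
    print(cropland_by_agreement(inputs, min_votes=3))
```

A stricter rule (for example, requiring all four maps to agree) reduces commission errors at the cost of omitting cropland that only some maps detect.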

GFSAD1KCD was created by combining the global irrigated
and rainfed cropland area map produced by the International
Water Management Institute with the maps of
dominant global crop-types produced by Ramankutty et al. 
(2008), Monfreda et al. (2008), and Portmann et al. (2010). In 
both cases, the maps were obtained from data for the period 
2007-2012. 


Product description 


GFSAD1KCM and GFSAD1KCD can be downloaded from
various different servers or tools, such as Data Pool, NASA
Earthdata Search, USGS EarthExplorer and the DAAC2Disk
Utility. In all cases, users download a raster file with the
cropland information. Downloads from Data Pool also
include a metadata file and a preview image of the product.


Downloads


GFSAD1KCDv001
- Raster file with crop dominance information

GFSAD1KCMv001
- Raster file with cropland extent


Legend and codification 


GFSAD1KCD

Code | Label
0 | Ocean or Water areas
1 | Irrigated (Wheat and Rice)
2 | Irrigated Mixed Crops 1 (Wheat, Rice, Barley, Soybeans)
3 | Irrigated Mixed Crops 2 (Wheat, Rice, Cotton, Orchards)
4 | Rainfed (Wheat, Rice, Soybeans, Sugarcane, Corn, Cassava)
5 | Rainfed (Wheat, Barley)
6 | Rainfed (Corn, Soybeans)
7 | Rainfed Mixed Crops (Wheat, Corn, Rice, Barley, Soybeans)
8 | Fractions of Mixed Crops (Wheat, Maize, Rice, Barley, Soybeans)
9 | Non-cropland areas

GFSAD1KCM

Code | Label
0 | Ocean or Water areas
1 | Croplands, Irrigation Major
2 | Croplands, Irrigation Minor
3 | Croplands, Rainfed
4 | Croplands, Rainfed, Minor Fragments
5 | Croplands, Rainfed, Very Minor Fragments
9 | Non-Cropland areas


Practical considerations 


GFSAD1KCM and GFSAD1KCD were produced independently
for different purposes and therefore cannot be
compared. Although GFSAD1KCD provides information
on crop dominance, it can also be used to study cropland
extent.

According to the authors, data about cropping intensity
can be obtained from this product using a time series of
Normalized Difference Vegetation Index (NDVI) data.


5 Global Synergy Cropland Map


Product | LULC thematic
Dates | 2010
Formats | Raster
Pixel size | 500 m
Theme | Percentage of cropland cover
Extent | Global
Updating | Not expected
Change detection | No (only one date)
Overall accuracy | Expected to be >90%
Website of reference | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ZWSFAA
Website language | English
Download site | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ZWSFAA
Availability | Open Access
Format(s) | .tiff
Technical documentation | Lu et al. (2020)
Other references of interest | Yu et al. (2020)


D. García-Álvarez and J. Lara Hinojosa 


Global Thematic Land Use Cover Datasets Characterizing ... 
Project 


The Global Synergy Cropland Map is a dataset created
within the framework of the Spatial Production Allocation
Model (SPAM), which maps agricultural production across
the world. It is a joint effort involving institutions
and universities across the world: AGRIRS, IFPRI, the Chinese
Academy of Agricultural Sciences and Victoria University
of Wellington.

The project team aimed to create a more accurate crop- 
land dataset that would be useful for agricultural monitoring 
and food security policies and studies. The obtained map is a 
critical input of SPAM. 


Production method 


A self-adapting statistics allocation model (SASAM) is used 
to generate the Global Synergy Cropland Map, using LUC 
datasets at global, supranational and national scales as input, 
as well as FAO agricultural statistics at national and sub- 
national levels. 

Two layers were generated by the model. Firstly, an 
agreement layer, which shows the level of agreement of all 
the datasets regarding the location of cropland areas, and 
secondly, an average cropland percentage layer, obtained by 
calculating the average of all the input maps. For the 
agreement layer, datasets with a higher accuracy are given 
more weight. This accuracy is based on the agreement 
between each input dataset and the FAO statistics. For the 
cropland percentage layer, the cropland category definitions 
in the input maps were translated into cropland percentages. 

The final cropland map was obtained after executing the
SASAM model, which allocated cropland to the areas with
the highest probability in the agreement layer until the total
surface area for cropland according to FAO statistics for
each country was reached.
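
The allocation logic described above can be sketched as a simple greedy procedure. This is a simplified illustration under stated assumptions, not the SASAM implementation: each pixel is treated as fully cropland, pixel identifiers and scores are hypothetical, and the pixel area of 25 ha corresponds to a nominal 500 m cell.

```python
def allocate_cropland(agreement, pixel_area_ha, target_area_ha):
    """Greedy allocation: take pixels in descending agreement until the target area is met."""
    ranked = sorted(agreement, key=lambda pid: agreement[pid], reverse=True)
    allocated, total = [], 0.0
    for pid in ranked:
        if total >= target_area_ha:
            break  # the national statistic has been reached
        allocated.append(pid)
        total += pixel_area_ha
    return allocated, total


# Hypothetical agreement scores for four pixels of one country.
scores = {"a": 0.9, "b": 0.4, "c": 0.7, "d": 0.1}
chosen, area = allocate_cropland(scores, pixel_area_ha=25.0, target_area_ha=50.0)
```

In this toy case the two pixels with the highest agreement ("a" and "c") are enough to reach the 50 ha target, so the remaining pixels stay unallocated.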


Product description 


The raster file showing the cropland percentage can be 
downloaded separately. However, we recommend the full 
download, which also contains additional information about 
the dataset, such as its level of confidence. 


Downloads 


Global synergy cropland map (full download) 


— Raster file with cropland percentage (.tiff) 

— Raster file with information about the confidence level of the 
cropland map (.tiff) 

— A text file with information about the downloaded product 


Legend and codification 


Code | Label
0-1 | Cropland extent percent (0-100%)


Practical considerations 


More information about the associated SPAM project is 
available at www.mapspam.info. The website includes all 
the spatial datasets about agricultural production generated 
as part of the project. These complement the information 
provided by the cropland map reviewed here. 
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6 UCL—Unified Cropland Layer 


Product | LULC thematic
Dates | 2014
Formats | Raster
Pixel size | 250 m
Theme | Percentage of cropland cover, although for some areas it only informs about the extent
Extent | Global
Updating | Not planned
Change detection | No (only one date)
Overall accuracy | Expected to be >83%
Website of reference | https://figshare.com/articles/dataset/ucl_2014_v2_0_tif/2066742
Website language | English
Download site | https://figshare.com/articles/dataset/ucl_2014_v2_0_tif/2066742
Availability | Open Access
Format(s) | .tiff
Technical documentation | Waldner et al. (2016)
Other references of interest | None listed


Project 


The Unified Cropland Layer (UCL) is one of the results of
the SIGMA (Stimulating Innovation for Global Monitoring
of Agriculture and its Impact on the Environment in support
of GEOGLAM) project. SIGMA was a European-funded
project that sought to improve agricultural monitoring and
forecasting tools using earth observation data. The project
involved 22 renowned international institutions, many
of which were experts in agricultural monitoring. In addition,
the project was part of the European contribution to the
Global Agricultural Geo-Monitoring (GEOGLAM)
initiative.

Twelve of the 22 institutions involved in this project took part
in the production of the UCL. Its aim was to enhance the
global mapping of cropland areas, contributing to studies
and activities assessing the current situation of cropland
areas across the world, assessing cropland changes and
providing new data for the production of cropland statistics.
The UCL uses the definition of cropland proposed by the
Joint Experiment for Crop Assessment and Monitoring
(JECAM).


Production method 


The UCL was obtained by combining the best available 
LUC cropland datasets for each area of the world. To this 
end, up to 49 different LUC datasets at global, regional and 
national scales were reviewed and assessed. They were 
resampled at a spatial resolution of 250 m and, when several 
dates were available, the closest to 2014 was selected. 

The best dataset was selected on the basis of a
multi-criteria analysis considering four criteria:
(i) match between the legend and the definition of cropland
used by the UCL; (ii) match between the spatial resolution
and the cropland pattern in each area; (iii) the timeliness of
the datasets with regard to the UCL year of reference (2014); and
(iv) the confidence level of each dataset.

Each input source was scored according to the four criteria.
The scores were later reviewed by experts on the topic.
After this review, the scores were combined to create a
single indicator. The dataset with the highest score in this
indicator was selected for each pixel. When the input datasets
provided information on the proportion of cropland, this
information was maintained. In all other cases, the UCL only
distinguishes between cropland and non-cropland
areas.
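
The selection step can be illustrated with the following sketch. The text does not give the formula used to combine the four scores, so the equally weighted mean, the score values and the dataset names are all assumptions made for illustration. Mapping binary sources to 0 or 100 is likewise an assumption based on the 0-100 legend of the product.

```python
def combined_score(scores, weights=None):
    """Combine the four criterion scores into one indicator (weighted mean, assumed)."""
    weights = weights or {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def select_best(candidates):
    """Return the name of the candidate dataset with the highest combined score."""
    return max(candidates, key=lambda name: combined_score(candidates[name]["scores"]))


def ucl_value(candidate, pixel_value):
    """Keep the proportion when the source has one; otherwise map a binary flag to 0 or 100."""
    if candidate["has_proportion"]:
        return pixel_value
    return 100 if pixel_value else 0


# Two hypothetical candidate datasets covering the same pixel.
candidates = {
    "national_map": {"has_proportion": False,
                     "scores": {"legend": 4, "resolution": 5, "timeliness": 4, "confidence": 4}},
    "global_map": {"has_proportion": True,
                   "scores": {"legend": 3, "resolution": 2, "timeliness": 5, "confidence": 3}},
}
best = select_best(candidates)
value = ucl_value(candidates[best], pixel_value=1)
```

Here the hypothetical national map wins, and because it is binary the cropland pixel is written as 100.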


Product description 


The UCL download includes the raster file with the cropland 
information, as well as a preview image of the product and 
the technical paper describing the map. Each file can also be 
downloaded independently. 


Downloads 
Unified Cropland Layer 
— Raster file with cropland information (.tiff) 


— Preview image of the map (.png) 
— Paper describing the map 


Legend and codification 


Code | Label
0-100 | Cropland proportion (0-100%)


7 GFSAD30 Cropland Extent


Product | LULC thematic
Dates | 2015 (2010 for North America)
Formats | Raster
Pixel size | 30 m
Theme | Extent of cropland
Extent | Global
Updating | Not expected
Change detection | No (only one date)
Overall accuracy | Expected to be > 91%
Website of reference | https://www.usgs.gov/centers/wgsc/science/global-food-security-support-analysis-data-30-m-gfsad
Website language | English
Download site | https://croplands.org/; https://croplands.org/downloadLPDAAC
Availability | Open Access
Format(s) | .tiff
Other references of interest | Teluguntla et al. (2015)



Project 


Global Food Security-support Analysis Data 30 metre 
(GFSAD30) was a project aimed at producing 
high-resolution cropland maps to inform global food and 
water security studies and policies. The project sought to 
overcome some of the limitations presented by previous 
cropland datasets, such as sources of uncertainty, insufficient 
precision in the allocation of cropped areas, and a lack of 
information regarding the intensity and irrigation status of 
cropland areas. 

GFSAD30 was the continuation of earlier projects (the
GFSAD1KCM and GFSAD1KCD datasets described above)
with similar purposes. They all formed part of the MEaSUREs
(Making Earth System Data Records for Use in
Research Environments) programme, which promotes the
use of data from NASA missions to produce innovative
products that are useful for research and policy-making.

Various different US institutions (USGS, BAER Institute, 
U.S. Department of Agriculture, U.S. Environmental Pro- 
tection Agency) and universities (New Hampshire, Califor- 
nia, Wisconsin, Northern Arizona) took part in the project, 
together with Google and institutions from other countries 
(ICRISAT, IAARD). 

A global map of cropland extent at a spatial resolution of 
30 m for the reference year 2015 was delivered as part of the 
project. The global map was obtained after merging different 
maps that had been independently produced for seven dif- 
ferent regions across the world. The map for North America 
was produced for the reference year 2010, instead of 2015. 


Production method 


GFSAD30 is made up of seven datasets which were independently
produced for different regions across the world:
Europe, Middle East, Russia and Central Asia; Africa;
Australia, New Zealand, China and Mongolia; South Asia,
Afghanistan and Iran; Southeast and Northeast Asia; North
America; and South America.
Each dataset was produced following a specific production
method, although they all share certain common features.
The same imagery source (Landsat) was used for all seven
datasets. Sentinel-2 imagery was also used to map the extent
of cropland in Africa. Other auxiliary data, such as elevation
data from the SRTM radar, were used for the production of
several datasets. In all cases, the extent of cropland was
computed using the Google Earth Engine (GEE) platform.
The classification workflow varies in each case. The most
frequent classification method was the random forest algorithm.
For some datasets, such as Africa, additional classifiers
(support vector machines, an object-based classifier) were
also used. In addition, in order to take the geographical
variability within the mapped area into account, producers
usually split the classification into agro-ecological zones
(AEZs).


Product description 


GFSAD30 is distributed in tiles with a 10° edge for each of 
the mapped regions. Datasets are available from different 
servers or tools, including Data Pool, NASA Earthdata 
Search, USGS EarthExplorer and the DAAC2Disk Utility. 
We recommend that users download the dataset through
NASA Earthdata Search or USGS EarthExplorer, on which
the geographical coverage of each tile can be visualized.

In most cases, the download only includes a raster file 
with the extent of cropland in .tiff format. Nonetheless, the 
download from the Data Pool server also includes a metadata 
file and a preview image of the product. 


Downloads 


GFSAD30AFCE v001 


— Raster file with cropland extent 


Legend and codification 


Code | Label
0 | Water
1 | Non-Cropland
2 | Cropland


Practical considerations 


The global map obtained after merging the seven GFSAD30
datasets can be consulted online at the project's website.³
The website also includes other important products for
mapping cropland at coarser scales (250 m, 1 km), as well as
datasets about irrigated/rainfed cropland areas for South
Asia, Iran, Afghanistan and Australia. Users can also
download a dataset validating the product
(GFSAD30VAL).⁴

In addition to the technical documentation published as
reports and papers in journals, other interesting technical
documents are also available on the website.⁵


3 www.croplands.org.
4 https://lpdaac.usgs.gov/products/gfsad30valv001/.
5 https://www.croplands.org/documents.


8 ASAP Land Cover Masks


Product | LULC thematic
Dates | 2019
Formats | Raster
Pixel size | 1 km (resampled from 250 m original resolution)
Theme | Percentage of cropland/rangeland covers
Extent | Global
Updating | Not planned
Change detection | No (only one date)
Overall accuracy | Not specified
Website of reference | https://mars.jrc.ec.europa.eu/asap/index.php
Website language | English
Download site | https://mars.jrc.ec.europa.eu/asap/download.php
Availability | Open Access
Format(s) | .tiff
Technical documentation | Meroni et al. (2019)
Other references of interest | Pérez-Hoyos et al. (2017a), Pérez-Hoyos et al. (2017b), Rembold et al. (2019), Vancutsem et al. (2013)


Project 


Anomaly hot Spots of Agricultural Production (ASAP) is an 
online decision support system developed and maintained by 
the Monitoring Agricultural Resources unit (MARS) of the 
Joint Research Centre (JRC) of the European Commission to 
monitor anomalies in global agricultural production. The
system supports early warnings and assessments on food
security, thus providing a useful tool for many international
organizations working in this field.

Two land cover maps charting global crop and rangeland 
cover fractions were specifically produced for ASAP and are 
accessible to any interested user. These layers are required to 
compute anomalies based on rainfall and vegetation index 
data, which are later translated into timely warnings about 
potential food security problems. 

The maps build on previous work carried out for similar
purposes by the JRC. For Africa, the maps
follow an approach similar to that proposed by Vancutsem
et al. (2013) and further refined by Pérez-Hoyos et al. (2017a).


Production method 


The cropland and rangeland cover maps for ASAP were 
produced by combining the best available LUC data for each 
country. To select the best available source for each case, 
different criteria were employed depending on the country or 
geographical area. The selected data sources for each map 
(cropland, rangeland) also varied. 

For Africa and part of Asia (Bangladesh, Indonesia, Laos,
Myanmar, Thailand, Timor-Leste, Philippines and Vietnam),
eight global LUC datasets (CGLS-LC100, GLC2000,
GLCNMO, GlobCover, GLC30, LC-CCI, MODISLC, S2
Prototype Land Cover) were compared according to different
criteria. In the African case, the most suitable dataset was
selected on the basis of timeliness, spatial resolution,
agreement with FAO statistics, accuracy and expert knowledge.
In the Asian case, only accuracy and agreement with
FAO statistics were considered.

For the rest of the countries, when a suitable regional
dataset was available, this was the one selected. When
no suitable regional dataset was available, the global LUC
dataset with the highest spatial resolution was chosen. If this
was not considered valid when assessed against Google
Earth imagery, the FAO-GLCshare dataset was selected in
its place.
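
The fallback order described in this paragraph can be written as a small decision function. This is only a sketch: the function and parameter names are hypothetical, the visual check against Google Earth imagery is represented by a caller-supplied function, and the three global datasets listed are an illustrative subset with their nominal pixel sizes.

```python
def choose_source(regional, global_datasets, passes_visual_check):
    """Pick the land cover source for one country, following the fallback order in the text."""
    if regional is not None:
        return regional
    # The highest spatial resolution corresponds to the smallest pixel size.
    finest = min(global_datasets, key=lambda d: d["pixel_size_m"])
    if passes_visual_check(finest["name"]):
        return finest["name"]
    return "FAO-GLCshare"


# Illustrative subset of global datasets with nominal pixel sizes in metres.
global_options = [
    {"name": "GLC30", "pixel_size_m": 30},
    {"name": "GlobCover", "pixel_size_m": 300},
    {"name": "MODISLC", "pixel_size_m": 500},
]
with_regional = choose_source("national_map", global_options, lambda name: True)
accepted = choose_source(None, global_options, lambda name: True)
rejected = choose_source(None, global_options, lambda name: False)
```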

The maps were initially produced at 250 m and later
resampled to 1 km in line with the requirements of the ASAP
system.
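
As an illustration of what resampling a percentage layer from 250 m to 1 km involves, the sketch below averages non-overlapping blocks of 4 x 4 pixels. The source does not state which resampling method the producers used, so the block mean is an assumption, and the toy grid is hypothetical.

```python
def aggregate_blocks(grid, factor=4):
    """Average non-overlapping factor x factor blocks (250 m to 1 km when factor is 4)."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [grid[r + i][c + j] for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Toy 4 x 8 grid of cropland percentages: fully cropped on the left, none on the right.
fine = [[100] * 4 + [0] * 4 for _ in range(4)]
coarse = aggregate_blocks(fine)
```

Rows and columns that do not fill a complete block are dropped in this sketch; a production workflow would need an explicit rule for such edges.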


Product description 

The raster files containing the cropland and rangeland cover 
information can be downloaded from the ASAP website. No 
auxiliary information is available for these datasets. 


Downloads 


ASAP crop mask 
— Raster file with cropland percentage (.tiff) 


ASAP rangeland mask 


— Raster file with rangeland percentage (.tiff) 


Legend and codification 


ASAP crop mask

Code | Label
0-100 | Cropland Coverage (0-100%)


ASAP rangeland mask

Code | Label
0-100 | Rangeland Coverage (0-100%)


Practical considerations 


Although not directly available for download, the
original map at a spatial resolution of 250 m can be obtained on
request from the members of the ASAP Team.⁶ Previous versions
of the dataset for Africa developed by Vancutsem et al.
(2013) and Pérez-Hoyos et al. (2017a) can also be accessed in the
same way.


6 https://mars.jrc.ec.europa.eu/asap/about.php.


Abstract


The mapping of artificial covers at a global scale has 
received increasing attention in recent years. Numerous 
thematic global Land Use Cover (LUC) datasets focusing 
on artificial surfaces have been produced at increasingly 
high spatial resolutions and using methods that ensure 
improved levels of accuracy. In fact, there are several 
long time series of maps showing the evolution of 
artificial surfaces from the 1980s to the present. Most of 
them allow for change detection over time, which is 
possible, thanks to the high level of accuracy at which 
artificial surfaces can be mapped and because transitions 
from artificial to non-artificial covers are very rare. Global 
thematic LUC datasets characterizing artificial covers 
usually map the extent or percentage of artificial or urban 
areas across the world. They do not provide thematic 
detail on the different uses or covers that make up 
artificial or urban surfaces. Unlike other general or 
thematic LUC datasets, those focusing on artificial covers 
make extensive use of radar data. In several cases, optical 
and radar imagery have been used together, as each 
source provides complementary information. Global 
Urban Expansion 1992-2016 and ISA, which were 
produced at a spatial resolution of 1 km, are the coarsest 
of the nine datasets reviewed in this chapter. ISA provides 
information on the percentage of impervious surface area 
per pixel. The GHSL edition of 2014 and the GMIS at
30 m also provide sub-pixel information, whereas all the
other datasets reviewed here only map the extent of
artificial/impervious/urban areas. Most of the datasets
reviewed in this chapter were produced at a spatial
resolution of 30 m. This is due to the extensive use of
Landsat imagery in the production of these datasets.
Landsat provides a long, high-resolution series of satellite 
imagery that enables effective mapping of the evolution 
of impervious surfaces at detailed scales. Of the datasets
produced at 30 m, Global Urban Land maps artificial
covers for seven different dates between 1980 and 2015,
while GHSL does the same for five different dates
between 1975 and 2016, although the map for the last
date was produced at 20 m. GUB maps the extent of
urban land for seven dates between 1990 and 2018 and 
was produced together with GAIA, which provides an 
annual series of maps for the period 1985-2018. HBASE, 
GMIS and GISM, also at 30 m, are only available for one 
reference year. The same is true of GUF and WSF, which 
were produced as part of the same effort to map global 
artificial surfaces as accurately as possible. They provide
the most detailed datasets to date, with spatial
resolutions of 12 m (GUF) and 10 m (WSF). Future
updates of WSF will produce a consistent time series of
global LC maps of artificial areas from the 1980s to the
present. It aims to be the longest, most detailed, most
accurate dataset ever produced on this subject.


1 Global Urban Land


Product | LULC thematic
Dates | 1980, 1990, 1995, 2000, 2005, 2010, 2015
Formats | Raster
Pixel size | 30 m
Theme | Extent of artificial areas
Extent | Global
Updating | Not planned
Change detection | Yes
Overall accuracy | Expected to be > 80%
Website of reference | http://www.geosimulation.cn/GlobalUrbanLand.html
Website language | English
Download site | http://www.geosimulation.cn/GlobalUrbanLand.html
Availability | Open Access
Format(s) | .tiff
Technical documentation | Liu et al. (2018)
Other references of interest | None listed


D. García-Álvarez et al. 


Global Thematic Land Use Cover Datasets ... 
Project 


Global Urban Land, also referred to as Multi-temporal 
Global Impervious Surface (MGIS), is a project developed 
by researchers from different Chinese universities (Sun 
Yat-sen, East China Normal, Guangzhou and Jiangsu Nor- 
mal) to create a high-resolution multi-temporal urban land 
dataset. They aimed to provide high-resolution data about 
urban areas at multiple dates, which could be useful for those 
studying urbanization and the impact of artificial surfaces 
and human activities on the environment. 

In this dataset, urban land is understood as impervious
surface. It is therefore comparable to the datasets
mapping artificial or impervious surfaces, such as GAIA.
Initially, the dataset was produced for the period 1990-2010,
with maps every 5 years. However, it has since been updated
with new data for the years 1980 and 2015.


Production method 


Global Urban Land is obtained through an index-based 
method that automatically predicts urban land: the Normal- 
ized Urban Areas Composite Index (NUACI). The index, 
implemented through the Google Earth Engine (GEE) plat- 
form, uses Landsat imagery and DMSP-OLS nighttime 
lights images as inputs. 

To calibrate the index, the world was stratified into dif- 
ferent urban ecoregion categories, according to the particular 
physical and socioeconomic characteristics of each urban 
region. Three indexes (NDWI, NDVI and NDBI) were 
extracted from Landsat imagery to calculate the NUACI. In 
addition, a binary mask was obtained by segmenting 
DMSP-OLS nighttime lights images into urban and 
non-urban by applying a specific threshold. On the basis of 
these data, the NUACI index was calculated, obtaining a raster 
showing the percentage of impervious surface area per pixel. 
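
The inputs to this step can be sketched as follows. The three indices are computed here with their common normalized-difference definitions (NDWI in its green/NIR form, which is only one of several variants in use), and the nighttime lights mask uses an arbitrary illustrative threshold. The NUACI formula itself is not reproduced, because it is not given in this chapter; band names and values are assumptions.

```python
def normalized_difference(a, b):
    """Generic normalized difference (a - b) / (a + b); returns 0.0 when the sum is zero."""
    total = a + b
    return (a - b) / total if total else 0.0


def spectral_indices(green, red, nir, swir1):
    """Return NDVI, NDWI and NDBI for one pixel of reflectance values."""
    return {
        "NDVI": normalized_difference(nir, red),    # vegetation
        "NDWI": normalized_difference(green, nir),  # open water (green/NIR variant)
        "NDBI": normalized_difference(swir1, nir),  # built-up
    }


def night_light_mask(dn_values, threshold):
    """Binary mask: 1 where the nighttime lights value exceeds the threshold."""
    return [1 if dn > threshold else 0 for dn in dn_values]


# Hypothetical reflectances for one pixel and nighttime lights values for three pixels.
pixel = spectral_indices(green=0.10, red=0.10, nir=0.30, swir1=0.20)
mask = night_light_mask([3, 12, 40], threshold=10)
```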

The final Global Urban Land dataset was obtained after 
applying region-specific segmentation thresholds to the 
NUACI images showing the degree of imperviousness. After 
this step, a binary urban/non-urban map was generated. 

For the calibration of the NUACI index, as well as for the
application of segmentation thresholds, cities were randomly
assigned to three equal-sized groups: centroid sites, threshold
sites and testing sites. Different criteria for index
calibration and threshold segmentation were decided for
each type of site.
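
A random assignment of this kind can be sketched in a few lines. The city names, the fixed seed and the dealing scheme are illustrative assumptions; the source does not describe how the authors implemented the split.

```python
import random


def split_cities(cities, seed=0):
    """Shuffle the cities and deal them into three groups of (near-)equal size."""
    shuffled = list(cities)
    random.Random(seed).shuffle(shuffled)
    names = ("centroid", "threshold", "testing")
    return {name: shuffled[i::3] for i, name in enumerate(names)}


# Nine hypothetical cities, giving three cities per group.
groups = split_cities([f"city_{i}" for i in range(9)])
```

Fixing the seed makes the split reproducible, which matters when the testing sites are later used to report accuracy.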


Product description 


The Global Urban Land dataset can be downloaded from 
three different servers: Baidu Drive, Google Drive and 
FTP. From them, users will be able to separately download 
the dataset for each of the available years of reference. For 
each year, there is a compressed folder (.zip) containing the 
whole dataset distributed in tiles. 

An auxiliary vector file (.shp) is provided to help users
identify the numbers of the files corresponding to their area of
interest (field “grid_id”). The scientific paper presenting the
dataset is also available for download, together with a text
file with relevant technical information about the product
and the reference data used to produce the dataset for the
initial period 1990-2010.


Downloads 

Global Urban Land 2010 

— Raster files with the extent of the artificial surfaces for each tile into 
which the dataset was divided (.tiff) 


Legend and codification 


Code | Label
0 | Non-urban land
1 | Urban land


Practical considerations 


The authors have identified several uncertainties and limi- 
tations in the dataset. The 1990 map has missing data areas 
due to the lack of Landsat imagery or reference data for these 
areas. The binary mask used to create the dataset may also 
introduce some uncertainties, as it was unable to detect some 
urban infrastructure. In addition, the accuracy of the dataset 
is relatively low in arid and tropical areas. The authors also 
described the limitations associated with a binary 
(urban/non-urban) mapping approach, which oversimplifies 
the real situation being mapped. 
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2 GHSL (Global Human Settlement Layer)—Built-up Area 




Product | LULC thematic
Dates | 1975, 1990, 2000, 2014, 2016, 2018
Formats | Raster
Pixel size | 10 m (2018); 20 m (2016); 30 m, 250 m and 1 km (1975-2014)
Theme | Extent of built-up areas (1975-2014, 2016); built-up areas probability (2018)
Extent | Global
Updating | Expected
Change detection | Yes, except for the 2016 and 2018 layers
Overall accuracy | Expected to be > 89% (2014)
Website of reference | https://ghsl.jrc.ec.europa.eu/index.php
Website language | English
Download site | https://ghsl.jrc.ec.europa.eu/download.php
Availability | Open Access
Format(s) | .tiff
Technical documentation | Corbane et al. (2018, 2019a, 2019b, 2021), Pesaresi et al. (2016a)
Other references of interest | Joint Research Centre (2020), Melchiorri et al. (2018, 2019), Pesaresi et al. (2016b)


Project 


The GHSL is a project supported by the European Com- 
mission through its Joint Research Centre (JRC) and the 
Directorate General for Regional and Urban Policy (DG 
REGIO) and for Internal Market, Industry, Entrepreneurship 
and SMEs (DG GROWTH). The project is part of the 
Human Planet Initiative of the Group on Earth Observations 
(GEO). It builds on the research activity carried out by the 
JRC since 2010. 

The project aims to provide high-quality, detailed data 
that characterize human settlements at a global level over a 
period of time. The datasets obtained enable us to understand 
where people live and how human settlements have evolved 
over time. This provides a useful source of information in 
support of policy- and decision-making. In this regard, one 
of the purposes of this project is to contribute to the devel- 
opment of the indicators required to measure different policy 
objectives. 

The project has delivered three main products, one of
which maps the urban footprint of human settlements
(GHS-BUILT). This is the product described here, because
it is comparable to a land cover product. The other two
products are a global grid of population density
(GHS-POP) and a spatial layer of urban settlements classified
according to their typology (GHS-SMOD). They have
been produced for the same three time points and are based
on the initial GHS-BUILT layers.

GHS-BUILT was initially produced for the years 1975, 
1990, 2000 and 2014, providing a consistent time series of 
maps. They are available at three spatial resolutions, the 
finest one (30 m) providing information on the extent of the 
built-up areas. The aggregated maps (250 m, 1 km) give 
information on the percent of built-up areas per pixel. 

New editions of the GHS-BUILT product have recently 
been released for the years 2016 and 2018. However, they 
are based on different imagery (Sentinel-1 and Sentinel-2) 
and were obtained using different methods. They are there- 
fore not comparable to previous maps. 


Production method 


The GHS-BUILT maps for the period 1975-2014 were
produced by classifying the historical archive of Landsat
imagery with a Symbolic Machine Learning
(SML) classifier. This is a supervised classifier that builds on
a set of learning data, which includes information from
older versions of the same product and other auxiliary
datasets such as GLC30 or a global surface water product.

The classifier helped extract the following earth features 
from the imagery: clouds, water and built-up. After classi- 
fying the imagery mosaics for each of the periods under 
consideration (1975, 1990, 2000 and 2014), the classifica- 
tions were then merged, thus ensuring the consistency over 
time of the historical series of maps. 

The 2016 GHS-BUILT was also obtained using the SML
classifier. However, the classification was carried out on
Sentinel-1 backscatter imagery, thus adapting the classifier to
the potential and characteristics of this source of imagery.
Certain differences can also be identified with regard to the
learning data used in the image classification.

The 2018 GHS-BUILT was obtained by classifying a
global cloud-free composite of Sentinel-2 imagery through a
deep-learning-based framework called the
GHS-S2Net approach. A specific model was trained for each
UTM grid zone of the global map, which made it possible to
account for local variability and the computational requirements
of the models. The model builds on a convolutional neural network
architecture, which calculates the probability of built-up areas per
pixel. Each model was trained with data from previous
GHS-BUILT datasets, the European Settlement Map (see
Sect. 6 in chapter “Supra-National Thematic Land Use
Cover Datasets”), Facebook high-resolution settlement data
and Microsoft building footprint data.


Product description 


GHS-BUILT for the period 1975-2014 can be downloaded 
in small tiles or as a single global file. It is also provided at 
three different spatial resolutions (30 m, 250 m and 1 km) 
and in two different projections (Mollweide and Mercator). 

The map at 30 m can only be downloaded as a
multi-temporal product, providing information about the 
urban footprint for the whole period covered by the product 
(1975-2014). Maps at 250 m and 1 km can also be down- 
loaded for specific years, without reference to built-up areas 
for other time points. 

The dataset for 2016 obtained from Sentinel-1 imagery
can only be downloaded for the whole world as a single
zipped file. The dataset for 2018 is distributed in tiles
corresponding to UTM grid zones. A vector layer representing
the UTM grid zones into which the product is split can
be downloaded as an auxiliary file, together with the product's
metadata.


Downloads


GHS—Built-up 2018 (10 m) 


— Raster file with LUC information 


GHS—Built-up 2016 


— Raster files with LUC information for each of the tiles in which the 
product is divided (.tiff) (13 OTSU folder) 

— Global mosaic of the product (.vrt) (V1-0) 

— Vector file representing the tiles in which the product is distributed 
(.shp) (V1-0)

— PDF with the description of the product 


GHS—Built-up 2014 (250 m) 


— Raster file with LUC information 
— PDF with the description of the product 


GHS—Built-up multi-temporal (30 m) 


— Raster file with LUC information 
— PDF with the description of the product 


Legend and codification 


GHS—Built-up 2018 (10 m) 


Code | Label
0-100 | Probability of being built-up area (1-100)
255 | No data

GHS—Built-up 2016 (20 m) 

Code | Label
0 | No built-up/no data
1 | Built-up area


GHS—Built-up 2014 (250 m) 

Code Label 

0-100 Built-up area density (1-100) 
-200 No data 


GHS—Built-up multi-temporal (30 m) 


Code Label 

0 No data 
1 Water surface 
2 Land not built-up in any epoch 
3 Built-up from 2000 to 2014 epochs 
4 Built-up from 1990 to 2000 epochs 
5 Built-up from 1975 to 1990 epochs 
6 Built-up to 1975 epoch 
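
Because the multi-temporal legend records the epoch in which each pixel first appears as built-up, the built-up extent at a given date has to be derived by grouping codes. The following is a minimal sketch of that decoding, assuming the tile has already been read into a list of rows of integer codes. The function name `built_up_mask` and the toy tile are illustrative only and are not part of the GHS-BUILT distribution.

```python
# Codes of the GHS-BUILT multi-temporal (30 m) legend, as listed in the table above.
NO_DATA, WATER, NOT_BUILT = 0, 1, 2
# Code -> last year of the epoch in which the pixel was first mapped as built-up.
BUILT_BY = {3: 2014, 4: 2000, 5: 1990, 6: 1975}


def built_up_mask(codes, year):
    """Return a grid of True / False / None: built-up by `year`, not built-up, no data.

    `codes` is a list of rows of legend codes; `year` must be one of the epoch
    end years (1975, 1990, 2000, 2014).
    """
    if year not in BUILT_BY.values():
        raise ValueError(f"unsupported epoch year: {year}")
    mask = []
    for row in codes:
        out_row = []
        for code in row:
            if code == NO_DATA:
                out_row.append(None)
            else:
                # Water and never-built land are absent from BUILT_BY, so they map to False.
                out_row.append(code in BUILT_BY and BUILT_BY[code] <= year)
        mask.append(out_row)
    return mask


tile = [[0, 1, 2], [3, 4, 5], [6, 6, 2]]  # toy 3 x 3 tile, not real data
print(built_up_mask(tile, 1990))
```

With real tiles the same logic would normally be applied with an array library rather than nested lists; plain lists are used here only to keep the sketch self-contained.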


Practical considerations 


The maps for 2016 and 2018 are test versions of the product 
obtained with Sentinel-1 and Sentinel-2 imagery. They 
should therefore not be used together with the other 
GHS-BUILT maps as if they were part of the same series of 
maps. 

Users interested in the method used to produce this 
dataset can find the general workflow for built-up area 
extraction in the MASADA (Massive Spatial Automatic 
Data Analytics) tool.1 


1 https://ghsl.jrc.ec.europa.eu/tools.php. 
3 GAIA—Global Artificial Impervious Areas, GUB—Global Urban Boundaries 


Product: LULC thematic 
Dates: 1985-2018 (GAIA); 1990, 1995, 2000, 2005, 2010, 2015, 2018 (GUB) 
Formats: Raster (GAIA), Vector (GUB) 
Pixel size: 30 m 
Theme: Extent of artificial areas (GAIA); urban boundaries (GUB) 
Extent: Global 
Updating: Not planned 
Change detection: Yes 
Overall accuracy: Expected to be > 89% (GAIA) 
Website of reference: http://data.ess.tsinghua.edu.cn/ 
Website language: English 
Download site: http://data.ess.tsinghua.edu.cn/gaia.html (GAIA) 
               http://data.ess.tsinghua.edu.cn/gub.html (GUB) 
Availability: Open Access 
Format(s): .tiff, .shp 
Technical documentation: Gong et al. (2020), Li et al. (2020) 
Other references of interest: Gong et al. (2019), Li et al. (2015), Li and Gong (2016) 


Project 


This project was led by researchers from Tsinghua Univer- 
sity with the collaboration of colleagues from other Chinese 
and American universities, together with Google and the US 
Geological Survey. They have produced two different data- 
sets: Global Artificial Impervious Areas (GAIA) and Global 
Urban Boundaries (GUB). The second was obtained from 
the first and both were produced to better understand global 
urbanization and other human socioeconomic activities and 
their impacts on the environment. 

GAIA maps artificial surfaces across the world, whereas 
GUB maps urban areas. Unlike GAIA, GUB does not 
include small urban patches. In addition, in the GUB dataset 
non-artificial areas within cities, such as green areas or water 
bodies, are considered urban. 

The project took advantage of the full Landsat data 
archive (1985-2018), providing a temporally consistent 
series of maps in which the only change possible was from 
non-artificial to artificial surfaces. The project is part of the 
global LUC mapping efforts carried out by Tsinghua 
University, such as FROM-GLC or GLC250, which are 
described in previous chapters of this book. 


Production method 


GAIA was first produced via the classification of the Landsat 
imagery archive (1985-2018). The dataset obtained in this 
way was then used to produce GUB. Google Earth Engine 
(GEE) was used to create both datasets. 

Two different classification methodologies were followed 
to obtain GAIA: one for non-arid regions and the other for 
arid ones. This is due to the spectral confusion between 
impervious areas and bare lands. For classification purposes, 
the world was split into 583 tiles, of which 155 referred to 
arid environments. 

The classification of non-arid areas was based on the 
production team's previous experience in mapping artificial 
areas at local and national scales. Annual artificial areas were 
first obtained through an “Exclusion/Inclusion” algorithm, 
based on training data from earlier Landsat datasets and 
Google Earth imagery, and on NDVI, MNDWI and SWIR data 
from Landsat imagery. The time series of maps obtained in 
this way was then further refined through a “temporal 
consistency check” approach. 

For arid areas, a primary urban mask was first obtained 
for the year 2018 based on radar data from Sentinel-1 and 
VIIRS NTL. The classification of Sentinel-1 data was based 
on backscatter coefficients, and NTL data were classified 
using a quantile-based method. In both cases, different 
parameters were used for each arid biome. Once the 
two urban masks for 2018 had been obtained, they were 
combined. Then, the time series of maps was created using the 
same “Exclusion/Inclusion” algorithm and “temporal 
consistency check” approach applied to the non-arid regions. 
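
The constraint that pixels can only change from non-artificial to artificial can be illustrated with a very small example. The sketch below is a deliberately naive way of enforcing that rule on one pixel's yearly labels. It is not the temporal consistency check published for GAIA, which also has to deal with spurious detections, and the function name `enforce_irreversible` is hypothetical.

```python
def enforce_irreversible(series):
    """Make a per-pixel yearly series monotonic: once artificial (1), always artificial.

    `series` is a list of 0/1 labels ordered from the earliest to the latest year.
    This naive rule trusts the first artificial label it meets.
    """
    result = []
    sealed = False
    for label in series:
        sealed = sealed or label == 1
        result.append(1 if sealed else 0)
    return result


# Toy series for one pixel: the isolated 0 after the first 1 is overwritten.
print(enforce_irreversible([0, 0, 1, 0, 1, 1]))
```

A production workflow would first filter out isolated false positives, since this rule would otherwise propagate a single misclassified year through the rest of the series.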

The GUB dataset was later obtained on the basis of a 
combination of two inputs: a kernel density map at a spatial 
resolution of 1 km obtained from GAIA based on a kernel 
density estimation (KDE) approach; and an initial urban 
boundary obtained from a Cellular Automata (CA)-based 
modelling exercise at 30 m. The results were 
improved through a morphological approach with dilation 
and erosion processing. This last step improved the mapped 
urban boundaries around fringe urban areas. Small holes 
inside urban areas were removed in a post-processing stage. 
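
To give an idea of what dilation and erosion do to a binary urban mask, the sketch below applies a dilation followed by an erosion (a morphological closing) with a 3 x 3 window. The order of the two operations, the window size and the edge handling are assumptions made for illustration and are not taken from the GUB documentation. All function names are hypothetical.

```python
def _apply(grid, combine):
    """Apply `combine` (any or all) over each cell's 3 x 3 neighbourhood.

    Windows are clipped at the grid edges, so cells outside the grid are ignored.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            neighbours = [
                grid[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
            ]
            out_row.append(1 if combine(neighbours) else 0)
        out.append(out_row)
    return out


def dilate(grid):
    """A cell becomes 1 if any cell in its neighbourhood is 1."""
    return _apply(grid, any)


def erode(grid):
    """A cell stays 1 only if every cell in its neighbourhood is 1."""
    return _apply(grid, all)


def close_gaps(grid):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(grid))


# Toy 5 x 5 urban mask with a one-cell hole in the middle.
mask = [[1] * 5 for _ in range(5)]
mask[2][2] = 0
print(close_gaps(mask))
```

In this toy case the closing fills the hole while leaving the outer extent of the mask unchanged, which is the general behaviour that makes the operation useful for smoothing ragged boundaries.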


Product description 


GAIA is distributed in 3.5° x 3.5° tiles, named according to 
the latitude and longitude of their upper-left coordinates. 
Users can download a vector file (.shp) showing all the tiles 
and providing their names (field “FName ID”).2 GUB is 
distributed as a single global file for each of the 7 years 
available. 


Downloads 


GAIA 


— A raster file with the extent of artificial areas (.tiff) 


GUB 


— A vector file with urban boundaries (.shp) 


2 http://data.ess.tsinghua.edu.cn/data/GAIA/GAIA_shape.zip. 
Legend and codification 


GAIA 

Code Label*   Code Label   Code Label   Code Label   Code Label 
1    2018     8    2011    15   2004    22   1997    29   1990 
2    2017     9    2010    16   2003    23   1996    30   1989 
3    2016     10   2009    17   2002    24   1995    31   1988 
4    2015     11   2008    18   2001    25   1994    32   1987 
5    2014     12   2007    19   2000    26   1993    33   1986 
6    2013     13   2006    20   1999    27   1992    34   1985 
7    2012     14   2005    21   1998    28   1991 

*The label refers to the time when the pixel was sealed 


Database 

GUB 

    ORIGFID   urbanArea 
1   65461     1,67112000000 
2   65460     9,33964000000 
3   65459     2,31508000000 

— Orig FID: Unique identifier for each polygon 
— UrbanArea: area of the delimited urban area 


4 Global Urban Expansion 1992-2016 
Product: LULC thematic 
Dates: 1992, 1996, 2000, 2006, 2010, 2016 
Formats: Raster 
Pixel size: 1 km 
Theme: Extent of urban areas 
Extent: Global 
Updating: Not expected 
Change detection: Yes 
Overall accuracy: Expected to be > 90% 
Website of reference: https://doi.pangaea.de/10.1594/PANGAEA.892684 
Website language: English 
Download site: https://doi.pangaea.de/10.1594/PANGAEA.892684 
Availability: Open Access 
Format(s): .tiff 
Technical documentation: He et al. (2019) 
Other references of interest: none listed 


D. García-Álvarez et al. 


Global Thematic Land Use Cover Datasets ... 
Project 


The dataset on Global Urban Expansion is the result of the 
work carried out by a group of researchers from the Beijing 
Normal University, the China University of Geosciences and 
Murray State University in the USA. Their aim was to create 
a new dataset on urban expansion using fully convolutional 
network (FCN)-based methods, which would be able to 
overcome some of the limitations of previous datasets on the 
same topic: outdated datasets, low spatial resolutions and 
low levels of accuracy. 

The dataset provides useful information for studies 
addressing global urbanization and its impacts on the envi- 
ronment. It considers as urban all those built-up areas where 
human-constructed or artificial elements cover more than 
half of the area or pixel. 


Production method 


A specific fully convolutional network (FCN) was developed 
to map the urban areas in the Global Urban Expansion 
dataset. FCNs are deep learning architectures based on 
convolutional neural networks (CNNs) that classify images 
pixel by pixel. 

The FCN was fed with different sources of input data: 
Nighttime Light (NTL) imagery from NOAA and 
NPP-VIIRS, as well as Normalized Difference Vegetation 
Index (NDVI) and Land Surface Temperature (LST) data 
from MODIS. Other auxiliary data sources were also 
employed to obtain the Global Urban Expansion dataset: 
urban population statistics, Landsat imagery and the GHS 
and LC-CCI LUC datasets. LST data are only available for the 
period 2000-2016 and were therefore not used to map the 
urban areas in 1992 and 1996. 

The FCN was calibrated with data from MODIS Land 
Cover, differentiating urban from non-urban areas. The 
calibration provided the weights of the FCN, which were 
then used to obtain the final Global Urban Expansion 
dataset. 

A post-classification stage using population density data 
was carried out to ensure the consistency over time of the 
maps obtained. 


Product description 


The dataset can be downloaded as a single compressed file 
(.zip), including the raster files showing the urban expansion 
for each available year. No auxiliary information is provided 
with the dataset. 


Downloads 


Global Urban Expansion 


— Raster files with urban expansion data for each mapped year (.tiff) 


Legend and codification 


Code Label 

0 Non-urban area 
1 Urban area 
5 ISA—Global Inventory of the Spatial Distribution and Density of Constructed Impervious Surface Area 


Product: LULC thematic 
Dates: 2000/01, 2010 
Formats: Raster 
Pixel size: 1 km 
Theme: Impervious area density (0-100%) 
Extent: Global 
Updating: Not expected 
Change detection: Unknown 
Overall accuracy: Not specified 
Website of reference: https://www.ngdc.noaa.gov/eog/dmsp/download_global_isa.html 
Website language: English 
Download site: https://www.ngdc.noaa.gov/eog/dmsp/download_global_isa.html 
Availability: Open Access 
Format(s): .tiff 
Technical documentation: Elvidge et al. (2007) 
Other references of interest: Elvidge et al. (2004) 


Project 


ISA is the result of a project partially funded by NASA's 
Carbon Cycle research program and carried out by 
researchers from different American institutions and 
universities. It builds on a previous attempt to map Impervious 
Surface Area (ISA) for the USA led by the NOAA (National 
Oceanic and Atmospheric Administration). 

ISA was initially produced for the reference year 
2000/01. A new version of the dataset is available for 2010. 
The dataset is useful for understanding the global 
distribution of impervious areas and for studies analysing the 
impact of these covers and their associated uses on the 
environment. 

In addition to an ISA density grid, the project's outputs 
include spreadsheets with information about the quantity of 
ISA per person at country level and the ISA density per 
watershed. Watersheds are classified according to their 
proportion of ISA into three groups: stressed 
(1-10% ISA), impacted (10-25%) and degraded (≥25%). 
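
The watershed grouping can be written as a simple threshold rule. The sketch below takes the upper class to start at 25% and resolves the shared class boundaries in favour of the higher class. Both choices are assumptions made for illustration, and `watershed_class` is a hypothetical name rather than something shipped with the ISA spreadsheets.

```python
def watershed_class(isa_percent):
    """Classify a watershed by its share of impervious surface area (in percent).

    Assumption: exactly 10% counts as "impacted" and exactly 25% as "degraded";
    watersheds below 1% fall outside the three groups and return None.
    """
    if not 0 <= isa_percent <= 100:
        raise ValueError(f"percentage out of range: {isa_percent}")
    if isa_percent < 1:
        return None
    if isa_percent < 10:
        return "stressed"
    if isa_percent < 25:
        return "impacted"
    return "degraded"


for value in (0.5, 5, 10, 30):
    print(value, watershed_class(value))
```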


Production method 


The ISA density grid for the reference year 2000/01 was 
obtained through a model making use of night-time lights 
imagery (DMSP OLS) and a population count grid 
(LandScan). The night-time lights imagery was captured in 2000-01, 
whereas the population count grid dates from 2004. A linear 
regression was defined to estimate the ISA density based on 
those two inputs. Only cells with a population count of at 
least 3 were considered in the regression. The model was 
calibrated with the ISA dataset produced for the USA at 
30 m. 

There is no accompanying information about the 
production process of the 2010 map. We therefore cannot 
know whether it followed the same method as the previous map 
or whether changes were introduced in the production process. 


Product description 


The ISA dataset for the reference year 2010 is distributed as 
a single compressed file (.gz). For the reference year 
2000/01, the dataset is distributed in two different projec- 
tions (GCS, Mollweide) and formats (ENVI, GeoTiff). 

Spreadsheets containing ISA information per country and 
watershed are also available on the project website. This data 
is distributed together with a text file offering a technical 
explanation of these results. 


Downloads 


ISA (GeoTiff 2000-2001) 
— Raster file with ISA proportion (.tiff) 


Legend and codification 


Code Label 

0-100 Impervious area density 


Practical considerations 


Although there is an ISA map for 2010, no information is 
available about the way it was produced. If there were 
important differences between the production methods used 
in the 2000/01 and 2010 editions of ISA, the two maps could 
not be used for comparison purposes or land change studies. 

ISA was obtained from a calibration based on data for the 
USA. This may make the final result less accurate for countries 
with different night lights conditions, such as African coun- 
tries. It is therefore likely that this dataset underestimates ISA 
densities in many different parts of the world. 
6 HBASE and GMIS (Global High Resolution Urban Data from Landsat) 


Product: LULC thematic 
Dates: 2010 
Formats: Raster 
Pixel size: 30 m, 250 m, 1 km 
Theme: Extent of urban areas; percentage of impervious areas 
Extent: Global 
Updating: Not expected 
Change detection: No (only one date) 
Overall accuracy: Not specified 
Website of reference: https://sedac.ciesin.columbia.edu/data/collection/ulandsat 
Website language: English 
Download site: https://sedac.ciesin.columbia.edu/data/set/ulandsat-hbase-v1/data-download 
               https://sedac.ciesin.columbia.edu/data/set/ulandsat-gmis-v1/data-download 
Availability: Open Access after registration 
Format(s): .tiff 
Technical documentation: De Colstoun et al. (2017), Wang et al. (2017) 
Other references of interest: none listed 


Project 


Researchers from NASA, in collaboration with the 
University of Maryland and other American institutions, created 
two datasets to map artificial areas across the world: 
Global Human Built-up and Settlement Extent (HBASE) 
and Global Man-made Impervious Surface (GMIS). These 
were created within the context of NASA's Land Cover and 
Land Use Change (LCLUC) program. 

Both datasets used Landsat imagery available through the 
Landsat Global Land Survey (GLS) archive to consistently 
map impervious surfaces across the globe at high spatial 
resolution for the reference year 2010. These datasets aimed 
to overcome the resolution-related limitations of previous 
datasets. They can be useful for anyone studying impervious 
surfaces, their impact on the environment or their relation 
with other land dynamics. Because of the detail they pro- 
vide, they can be used for studies and applications at global, 
supra-national, national and local scales. 

HBASE and GMIS are complementary datasets, jointly 
produced to address the spectral confusion arising from the 
fact that many impervious areas are sealed with soil, sand, 
rocks, etc. and can therefore be confused with bare land. 
The HBASE dataset provides a mask to remove such areas 
from the GMIS dataset. 


Production method 


HBASE and GMIS were produced separately, although the 
first was used as a mask in the production of the second. In 
both cases, the GLS 2010 Surface Reflectance Dataset from 
Landsat was the input imagery. 

For the production of HBASE, the first stage was to 
segment the GLS imagery using the Recursive Hierarchical Image 
Segmentation (RHSeg) software package. This produced a 
series of objects, from which different textures and other 
variables were extracted. On the basis of these variables, a 
random forest (RF) classification was carried out to classify 
the segmented objects in HBASE/non-HBASE categories. 
Training data for the classification was obtained from Landsat 
and Google Earth imagery. In addition, OpenStreetMap was 
used as an auxiliary dataset in the post-classification process 
to improve the mapping of the roads, which had not been 
correctly classified in the previous stages. 

GMIS was obtained in two steps, with classifications 
carried out at the scene level. First, an object-based 
classification of GLS imagery was performed using the HSeg 
(Hierarchical Image Segmentation) Learn software to 
classify all the areas as either impervious or non-impervious. 
Only pixels effectively classified as HBASE in the previous 
dataset were considered and pixels with a low-quality clas- 
sification were discarded. Later, the percentage of impervi- 
ous area per pixel was calculated for all pixels classified as 
impervious through a regression-tree algorithm (Cubist). The 
algorithm was run with reference data from the National 
Geospatial-Intelligence Agency (NGA) at a spatial resolu- 
tion of 30 m. 


Product description 


An online viewer allows users to download HBASE and 
GMIS: (i) for a specific country, (ii) for the tiles into which 
the datasets are split3 or (iii) for user-defined areas of interest 
(by drawing a polygon or uploading a shapefile 
that defines the area). The files can be downloaded at the 
original resolution (30 m) or resampled to 250 m and 
1 km. Users can also choose between two projections: 
geographic or UTM. 

Other complementary products are also available for 
download: a layer of standard error for the production of the 
GMIS dataset and an HBASE probability layer. 


3 The datasets are split into tiles corresponding to the UTM zones. 


Downloads 
GMIS 
— Raster files with information on the percentage of impervious surface area (.tiff) 
— A text document with technical information about the product (.txt) 


HBASE 


— Raster files with information on the urban extent (.tiff) 
— A text document with technical information about the product (.txt) 


Legend and codification 


Global Man-made Impervious Surface (GMIS)—Percentage 


Code Label 

0-100 Percentage of impervious surface area (0-100%) 
200 Non-HBASE 

255 No data, clouds, shadows 
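
Because the GMIS raster mixes percentages (0-100) with flag values (200 and 255), statistics should be computed only after separating the two. The following sketch, which assumes the pixel values are available as a flat list of integers, shows one way to do this. The function names are hypothetical.

```python
NON_HBASE = 200  # flag value from the GMIS legend
NO_DATA = 255    # flag value from the GMIS legend


def mean_imperviousness(values):
    """Mean impervious percentage over valid (0-100) pixels, or None if there are none."""
    valid = [v for v in values if 0 <= v <= 100]
    if not valid:
        return None
    return sum(valid) / len(valid)


def count_by_kind(values):
    """Count pixels per legend group so that flags can be reported separately."""
    counts = {"percentage": 0, "non_hbase": 0, "no_data": 0, "other": 0}
    for v in values:
        if 0 <= v <= 100:
            counts["percentage"] += 1
        elif v == NON_HBASE:
            counts["non_hbase"] += 1
        elif v == NO_DATA:
            counts["no_data"] += 1
        else:
            counts["other"] += 1
    return counts


pixels = [0, 50, 100, 200, 255, 255]  # toy values, not real data
print(mean_imperviousness(pixels), count_by_kind(pixels))
```

Whether non-HBASE pixels should instead be counted as 0% impervious depends on the question being asked; the sketch leaves them out of the mean so that the choice stays explicit.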
Global Human Built-up and Settlement Extent (HBASE) 


Code Label 

200 Non-HBASE 
201 HBASE 
202 Road 
255 No data, clouds, shadows 


Practical considerations 


Users can explore the different datasets available online,4 
including the complementary layer about the standard error 
of the Impervious Surface Percentage raster and the HBASE 
probability layer. Full metadata for GMIS and HBASE is 
also available online.5 

GMIS and HBASE have some limitations associated with 
their production methodology. For example, they may 
present areas of missing information due to cloud cover or other 
factors. The technical documents for the product (cited 
below) provide a detailed description of all these limitations. 

As part of the same project, Landsat imagery composites 
for 66 urban areas are also available for download.6 


4 https://sedac.ciesin.columbia.edu/mapping/gmis-hbase/explore-view/. 
5 https://sedac.ciesin.columbia.edu/data/set/ulandsat-hbase-v1/metadata, 
https://sedac.ciesin.columbia.edu/data/set/ulandsat-gmis-v1/metadata. 
6 https://sedac.ciesin.columbia.edu/data/set/ulandsat-cities-from-space. 
7 GUF—Global Urban Footprint 


Product: LULC thematic 
Dates: 2011 
Formats: Raster 
Pixel size: 0.4 arc seconds (~12 m near the Equator) 
            2.8 arc seconds (~84 m near the Equator) 
Theme: Extent of built-up areas 
Extent: Global 
Updating: Not expected 
Change detection: No (only one date) 
Overall accuracy: Not specified 
Website of reference: https://www.dlr.de/eoc/en/desktopdefault.aspx/tabid-9628/16557_read-40454/ 
Website language: English 
Download site: https://www.dlr.de/eoc/en/desktopdefault.aspx/tabid-11725/20508_read-47944/ 
Availability: Open Access on request after filling in a request form 
Format(s): .tiff 
Technical documentation: Esch et al. (2010, 2012, 2013, 2017) 
Other references of interest: Esch et al. (2011, 2014, 2018a, 2018b, 2020), Marconcini et al. (2014) 


Project 


Global Urban Footprint (GUF) is a dataset produced by the 
German Aerospace Center (DLR) from radar imagery at 
very high spatial resolution: 0.4 arc seconds, which is 
equivalent to about 12 m at the Equator. The 
highest-resolution version is intended for scientific use, whereas a 
coarser version at 2.8 arc seconds (~84 m near the Equator) 
has been produced for non-commercial use by the general 
public. 
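
The relation between a pixel size expressed in arc seconds and its size on the ground can be checked with a short calculation. The sketch below assumes a spherical Earth with a mean radius of 6,371 km, which is an approximation, and the function name `arcsec_to_metres` is hypothetical. It also shows that the east-west size shrinks with the cosine of the latitude.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean radius of a spherical Earth (an approximation)


def arcsec_to_metres(arcsec, latitude_deg=0.0):
    """Return the (north-south, east-west) ground size in metres of a cell `arcsec` wide."""
    angle_rad = math.radians(arcsec / 3600)
    north_south = EARTH_RADIUS_M * angle_rad
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


for size in (0.4, 2.8):
    print(size, arcsec_to_metres(size), arcsec_to_metres(size, latitude_deg=60))
```

Under these assumptions the two GUF resolutions come out at roughly 12 m and 86 m at the Equator, in line with the rounded figures quoted above, and at about half that width in the east-west direction at 60 degrees of latitude.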

The dataset aims to facilitate the quantitative and qual- 
itative characterization of urban surfaces (size, form, spatial 
distribution) at different scales, from local to continental 
and global. Because of its high resolution, it allows all 
artificial surfaces to be analysed, in both urban and rural 
landscapes. This information is useful for researchers 
investigating the different impacts of the urbanization 
process, be they environmental, economic, political, soci- 
etal or cultural. 

The dataset was produced to overcome some of the 
limitations associated with previous global datasets on 
impervious surfaces, usually produced from demographic 
data. At the time it was produced, high spatial 
resolution datasets were only available for specific regions, 
such as North America and Europe. 

The project is part of the Urban Thematic Exploitation 
Platform (U-TEP) of the European Space Agency (ESA), 
which explores new methods and techniques to understand 
urban patterns and dynamics across the world. U-TEP is one 
of the seven Thematic Exploitation Platforms developed by 
the ESA to help data user communities. 

In the context of U-TEP, DLR has also developed the 
WSF dataset, which outperforms GUF and resolves some of 
the limitations associated with it. WSF, which is described 
later on in this chapter, is a natural progression from the 
work undertaken to produce GUF. The two datasets are 
closely linked. 

Based on GUF, a new layer on global built-up density 
was produced at a spatial resolution of 30 m for the refer- 
ence year 2012 (GUF-DenS 2012). It provides information 
about the percentage of sealed surface or greenness per cell. 
Other complementary products based on GUF have been 
also produced, although they have not been made available 
to the public, namely a layer characterizing settlement 
properties and patterns (GUF-NetS) and a layer defining the 
average building height (GUF-3D). 


Production method 


GUF was produced from radar imagery from the 
TerraSAR-X/TanDEM-X satellites at a spatial resolution of 
3 m. The imagery was captured between 2011 and 2012, 
except for a few images from 2013 and 2014. 

The first stage of the production process was to extract a 
texture feature (speckle divergence) from the input imagery. 
Then, based on this feature, a binary settlement layer 
differentiating between built-up and non-built-up areas was 
generated through an automatic unsupervised classifier: 
Support Vector Data Description (SVDD) one-class 
classification. The classification was carried out in 5° x 5° tiles. 
Once all the tiles had been processed, the resulting layers 
were mosaicked. 

In a post-classification stage, the dataset was assessed 
against reference data, which confirmed or excluded the 
presence of built-up surfaces: OpenStreetMap, GLC30, 
NLCD, Imperviousness HRL and SRTM DEM. 

Seven different layers were finally obtained on the basis 
of different classification settings: from very conservative 
settings (version 1) to very relaxed settings (version 7). 
Version 1 followed very strict criteria for classifying areas as 
built-up, whereas Version 7 followed much more relaxed, 
more inclusive criteria. 


Product description 


Interested users should request the product for their area of 
interest from the map's producers. Before accessing the 
dataset, they have to sign a license agreement. Depending on 
the use they intend to make of the dataset, they can access 
the fine resolution version of the dataset (0.4 arcsec), which 
is only available for scientific purposes, or the coarser ver- 
sion (2.8 arc seconds). In both cases, the download only 
includes the raster file with the LUC information. 


Downloads 


GUF 


— Raster file with built-up areas for the requested area of interest (.tiff) 


Legend and codification 


Code Label 

0 Non-built-up areas 
128 No data 
255 Built-up areas 


Practical considerations 


The dataset can be consulted online at the two spatial 
resolutions available.7 A short document summarizing the 
technical characteristics of the product and its methodology 
is also available online.8 

Many other interesting data sources for characterizing urban 
areas can be found at the U-TEP Visualisation and Analytics 
Toolbox.9 Users can also visualize the GUF-DenS 2012, which 
is not available for download. This dataset is complementary to 
GUF and provides information on the percentage of sealed 
surface for all the areas classified as built-up in GUF. 


7 https://geoservice.dlr.de/web/maps/eoc:guf:3857. 

8 https://www.dlr.de/eoc/en/PortalData/60/Resources/dokumente/guf/GUF_Product_Specifications_GUF_DLR_v01.pdf. 

9 https://urban-tep.eu/puma/tool/?id=567873922. 
8 WSF—World Settlement Footprint 


Product: LULC thematic 
Dates: 1985-2015, 2014/15, 2019 
Formats: Raster 
Pixel size: 10 m, 30 m, 100 m, 250 m, 500 m, 1 km, 10 km 
Theme: Extent of settlement areas (10 m) 
       Percentage of settlement areas (100 m, 250 m, 500 m, 1 km, 10 km) 
Extent: Global 
Updating: Expected 
Change detection: Not yet (will be available with updates) 
Overall accuracy: Expected to be > 86% 
Website of reference: https://www.esa.int/Applications/Observing_the_Earth/Mapping_our_global_human_footprint 
Website language: English 
Download site: https://springernature.figshare.com/collections/Outlining_where_humans_live_-_The_World_Settlement_Footprint_2015/4712852/1 
Availability: Open Access 
Format(s): .tiff 
Technical documentation: Marconcini et al. (2020) 
Other references of interest: Esch et al. (2018a), Esch et al. (2020) 


Project 


The World Settlement Footprint (WSF) is a dataset produced 
by the German Aerospace Center (DLR) within the context 
of a project (SAR4URBAN) funded by the European Space 
Agency (ESA) in which Synthetic Aperture Radar (SAR) is 
used to monitor urbanization. The project aimed to develop a 
new method to automatically map built-up areas via the joint 
use of radar and optical data. 

The dataset obtained is useful for the characterization and 
analysis of urban patterns across the world. It overcomes the 
limitations of previous high spatial resolution datasets 
mapping impervious surfaces by making use of both radar 
and optical imagery at the same time. This allows WSF to 
avoid the misclassifications that can result from using only 
one of the two types of sensors: optical imagery misclassifies 
sand and bare soil, whereas radar imagery misclassifies 
complex topography areas and forested regions. 

WSF is produced by the same institution as the Global 
Urban Footprint (GUF) described earlier in this chapter, and it 
overcomes some of the limitations associated with 
GUF, such as the misclassifications arising from the use of 
single-date scenes and the reliance on commercial imagery, which 
makes updating more difficult due to the associated costs. 
Like GUF, WSF was developed within the framework of 
the Urban Thematic Exploitation Platform (U-TEP) of the ESA. 

The dataset was originally produced at a spatial resolution 
of 10 m, although resampled versions at 100 m, 250 m, 
500 m, 1 km and 10 km are also available for download. 
The resampled versions show the percentage of settlement area 
in each pixel instead of a binary classification differentiating 
between settlement and non-settlement areas. 
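
The step from a binary 10 m layer to a percentage layer at a coarser resolution can be pictured as a block aggregation. The sketch below assumes that each coarse pixel simply reports the share of settlement cells (value 255) among the fine cells it covers, so that going from 10 m to 100 m would correspond to a factor of 10. This is an illustrative assumption, not a description of the resampling actually used by the producers, and `settlement_percentage` is a hypothetical name.

```python
SETTLEMENT = 255  # settlement value in the WSF 10 m legend


def settlement_percentage(grid, factor):
    """Aggregate a binary grid (0 / 255) into blocks of `factor` x `factor` cells.

    Returns the percentage of settlement cells in each block, rounded to the
    nearest integer. Grid dimensions must be multiples of `factor`.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [
                grid[rr][cc]
                for rr in range(r, r + factor)
                for cc in range(c, c + factor)
            ]
            share = sum(1 for v in block if v == SETTLEMENT) / len(block)
            out_row.append(round(100 * share))
        out.append(out_row)
    return out


fine = [
    [255, 255, 0, 0],
    [255, 0, 0, 0],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]  # toy 4 x 4 grid, not real data
print(settlement_percentage(fine, 2))
```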

The DLR is currently working with the Google Earth 
Engine Team on the update of the product, creating a 
WSF-Evolution dataset that will map the global evolution of 
built-up surfaces yearly from 1985 to 2015. 


Production method 


The WSF production methodology was first tested at a range 
of selected sites and, once validated, was applied to generate 
the global dataset. It used Sentinel-1 and Landsat 8 data for 
the reference years 2014 and 2015 as input. 

From Sentinel-1 data, key temporal statistics were extrac- 
ted from the original backscattering value. From Landsat 8 
imagery, different spectral indices were extracted: vegetation 
index, built-up index etc. Based on the extracted information, 
a binary classification (settlement/non-settlement) was com- 
puted through an ensemble of Support Vector Machines 
(SVM) classifiers for each type of input data: radar and 
optical. The two results were then combined. 

In a post-classification stage, the result obtained was 
assessed against reference information, following the 
post-editing object-based approach applied in the production 
of GUF. The auxiliary datasets were: OpenStreetMap, 
GLC30, SRTM DEM, ASTER DEM, NLCD and the 
High-Resolution Layer on imperviousness. 


Product description 


WSF can be downloaded at multiple spatial resolutions. For 
the original resolution (10 m), the users will download a 
compressed file (.zip) that includes all the raster files into 
which the dataset is split (306 .tiff files). The download also 
includes a virtual raster that merges all the tiles in a single 
mosaic. For all other available resolutions (100 m, 250 m, 
500 m, 1 km and 10 km), users can only download a .tiff file 
with data on the settlement percentage per pixel. No auxil- 
iary information is provided in either of the two cases. 


Downloads 
WSF 10 m 
— Raster files with the settlement extent for the 306 tiles into which the product is divided (.tiff) 
— Raster file with a mosaic of the WSF tiles (.vrt) 


WSF 100 m, 250 m, 500 m, 1 km, 10 km 


— Raster files with the settlement percentage (.tiff) 


Legend and codification 


WSF 10 m 

Code Label 

0 Non-settlement 
255 Settlement 

WSF 100 m, 250 m, 500 m, 1 km, 10 km 

Code Label 

0-100 Settlement percent (0-100%) 
255 Settlement 


Practical considerations 


WSF is considered by the authors to be the most accurate 
dataset of its type. It is part of the U-TEP tool, which also 
distributes many other datasets for characterizing urban areas 
that may be of interest to users. Users can access an online 
visualization of the dataset on the U-TEP tool website.10 

For more detailed information about the characteristics of 
the dataset, we recommend that interested users read the 
scientific paper in which it was presented. 


10 https://urban-tep.eu/puma/tool/?id=574795484&lang=en. 
9 GISM—Global Impervious Surface Map 

Product: LULC thematic 
Dates: 2015 
Formats: Raster 
Pixel size: 30 m 
Theme: Extent of impervious areas 
Extent: Global 
Updating: Not expected 
Change detection: No (only one date) 
Overall accuracy: Expected to be >95% 
Website of reference: Not available 
Website language: English 
Download site: https://zenodo.org/record/3505079#.YFoEDDhKiUk 
Availability: Open Access 
Format(s): .tiff 
Technical documentation: Zhang et al. (2020) 
Other references of interest: - 

Global Thematic Land Use Cover Datasets ... 


Project 


A group of researchers from Chinese institutions (Chinese 
Academy of Sciences, University of Science and Technol- 
ogy) and the University of Wisconsin-Milwaukee produced 
a Global Impervious Surface Map, which aimed to overcome 
some of the limitations of previous datasets. 

GISM is part of recent efforts to produce a detailed global 
mapping of artificial or impervious surfaces with a high level 
of accuracy to provide useful data that can help characterize 
artificial areas and their associated environmental and 
socioeconomic impacts. The dataset was produced with that 
aim, without any further updates being planned. 


Production method 


GISM was obtained by classifying Landsat and Sentinel-1 
data in the Google Earth Engine (GEE) platform, using the 
MSMT_RF method. First, temporal-spectral-textural 
features were extracted from Landsat imagery. Then, 
temporal-SAR features were extracted from Sentinel-1 
imagery. On the basis of all these features, a classification 
was carried out with a random forest classifier in 5° × 5° 
tiles. Training data for the classification were obtained from 
GLC30, VIIRS NTL and MODIS EVI imagery. 
SRTM DEM was used as an auxiliary dataset in the 
classification process. 


Product description 


GISM is distributed as a single compressed file (.zip) 
containing all the raster files into which the product is 
divided: 954 tiles of 5° × 5°. No auxiliary information is 
provided. 


Downloads 


GISM 


— Raster files mapping impervious areas for each of the tiles into 
which the dataset was divided (.tiff) 


Legend and codification 


Code Label Code Label 


1 Non-impervious 2 Impervious 


Practical considerations 


The only other relevant information on the dataset can be 
found in the scientific paper in which it was presented. Users 
wishing to find out more about the characteristics of this 
product should consult this paper. 
Abstract 

Supra-national thematic Land Use Cover (LUC) datasets 
are not very common. While there are several general 
datasets mapping all the land uses or covers in different 
supra-national areas across the world, LUC datasets with 
a similar extent that focus on the mapping of specific land 
covers in greater thematic detail are scarce. In this 
chapter, we review six different supra-national thematic 
LUC datasets. Three others were also found in the 
literature, but are not fully available for download, 
namely the TREES Vegetation Map of Tropical South 
America, the Central Africa—Vegetation map and 
FACET. The Circumpolar Arctic Region Vegetation 
dataset was also excluded from this review because of 
its specificity and coarse scale (1:7,500,000). Europe is 
the continent with the most relevant, most updated and 
most detailed LUC thematic datasets at supra-national 
scales. This is due to the work being done by the 
European Commission through its Joint Research Centre 
(JRC) and the Copernicus Land Monitoring Programme. 
The High-Resolution Layers (HRL) provide very detailed 
information, both thematically and spatially (from 10 m), 
for five different themes: imperviousness, tree cover, 
[...] funded by the European Commission and no updates are 
expected. The datasets reviewed in this chapter for other 
parts of the world focus on vegetation covers of tropical 
forests and other relevant areas in terms of biodiversity 
and environmental studies. These datasets were produced 
within projects funded by the European Commission and 
the United States Agency for International Development. 
Unlike the previous datasets for Europe, they are already 
outdated and are usually produced at coarser spatial 
resolutions: Insular Southeast Asia—Forest Cover Map 
(1 km, 1998/00); Continental Southeast Asia—Forest 
Cover Map (1 km, 1998/02). For its part, the Congo 
Basin Monitoring dataset, although outdated, provides 
information at a higher resolution (57 m) for two different 
dates: 1990, 2000. The Joint Research Centre of the 
European Commission also produced an African cropland 
mask as a source of information for policy-makers. Of all 
the datasets reviewed in this chapter, it is the only one to 
focus on agricultural covers. It was obtained from data 
fusion at 250 m. Consequently, it does not show the 
cropland areas of Africa for a specific date across the 
whole continent. 

1 Insular Southeast Asia—Forest Cover Map 

Product: LULC thematic 
Dates: 1998/00 
Formats: Raster 
Pixel size: 1 km 
Theme: 4 forest classes out of 10 
Extent: Insular Southeast Asia 
Updating: Not expected 
Change detection: No (only one date) 
Overall accuracy: Not specified 
Website of reference: https://forobs.jrc.ec.europa.eu/products/veget_map_insulare-sea/insularSEasia.php 
Website language: English 
Download site: https://forobs.jrc.ec.europa.eu/products/veget_map_insulare-sea/download_forest_cover_map_isea.php 
Availability: Open Access 
Format(s): .tiff 
Technical documentation: Stibig et al. (2002, 2003a, b) 
Other references of interest: - 

Supra-National Thematic Land Use Cover Datasets 
Project 


The Joint Research Centre (JRC) of the European Com- 
mission produced a map for Insular Southeast Asia which 
sought to provide a more accurate characterization of the 
forest covers in this region. It aimed to overcome the limi- 
tations associated with the mapping of vegetation covers in 
tropical regions, due to the persistence of cloud covers. 

The dataset covers Malaysia, Singapore, Indonesia, Bru- 
nei, East Timor, the Philippines and Papua New Guinea. It is 
especially useful for research into deforestation and biodi- 
versity due to the significance of the insular Southeast Asia 
forest ecosystem for the world as a whole. 

The dataset was produced within the context of the 
TRopical Ecosystem Environment observations by Satellite 
(TREES) project. The project aimed to produce regularly 
updated information to monitor forest covers in tropical 
regions at regional scales. 


Production method 


The forest map for Insular Southeast Asia was produced 
through the unsupervised classification (clustering and 
maximum likelihood classification) of a mosaic of imagery 
collected by the VEGETATION sensor of the SPOT satellite 
for the period 1998-2000. 

The unsupervised classification identified 60 spectral 
clusters. They were manually interpreted and labelled on the 
basis of information provided by other satellite imagery, 
maps of reference and field data. In addition, the initial set of 
clusters was regrouped on the basis of information provided 
by two auxiliary datasets: GTOPO30 DEM and WCMC 
forest map. After this initial processing, the remaining 
clusters were finally grouped into 8 LUC categories and a 
No-Data category. 


Product description 

The forest map for Insular Southeast Asia can be down- 
loaded as a single compressed file (.zip) containing the raster 
with the LUC information. No auxiliary information is 
provided. 


Downloads 


Insular Southeast Asia—Forest Cover Map 


— A raster with the LUC information (.tiff) 


Legend and codification 


Code  Label                       Code  Label 
0     No data                     5     Cropland 
1     Evergreen montane forest    6     Burnt/dry/sparse vegetation 
2     Evergreen lowland forest    7     Non-forest vegetation 
3     Mangrove forest             8     Water 
4     Swamp forest 


Practical considerations 


A full characterization of the dataset is provided in the 
technical report published by the European Commission and 
in the technical documentation cited above. 

The map comes with several limitations: a few seasonal 
monsoon forests in Sulawesi, New Guinea and Philippines 
were not mapped as an individual category, while degraded 
forest cover and mature stages of forest regrowth were 
sometimes mapped as forest. 
2 Continental Southeast Asia—Forest Cover Map 

Product: LULC thematic 
Dates: 1998/00 
Formats: Raster 
Pixel size: 1 km 
Theme: 8 forest/wood classes out of 14 
Extent: Bangladesh, Myanmar, Thailand, Laos, Cambodia, the Himalayas 
    mountain range, north-eastern India and southern China 
Updating: Not expected 
Change detection: No (only one date) 
Overall accuracy: Not specified 
Website of reference: https://forobs.jrc.ec.europa.eu/products/veget_map_continental-sea/continentalSEasia.php 
Website language: English 
Download site: https://forobs.jrc.ec.europa.eu/products/veget_map_continental-sea/download_forest_cover_map_csea.php 
Availability: Open Access 
Format(s): .tiff 
Technical documentation: Stibig et al. (2004) 
Other references of interest: - 


Project 


The forest map for Continental Southeast Asia was devel- 
oped by the Joint Research Centre (JRC) of the European 
Commission within the context of the TRopical Ecosystem 
Environment observations by Satellite (TREES) and 
GLC2000 projects. Other LUC maps on forest covers for 
Insular Southeast Asia and Central Africa were also devel- 
oped as part of the TREES project, following similar map- 
ping workflows. They are all reviewed in this chapter. 

The project aimed to provide regularly updated LUC 
information on tropical forests to help monitor activities in 
these regions. The obtained dataset covers Bangladesh, 
Myanmar, Thailand, Laos, Cambodia, the Himalaya moun- 
tain range and tropical areas of north-eastern India and 
southern China. 


Production method 


The dataset was produced through unsupervised classifica- 
tion of a cloud free mosaic of VEGETATION imagery for 
the period 1998-2000. The classification identified 70 
spectral clusters, which were manually labelled and inter- 
preted on the basis of information provided by Landsat 
imagery, field-collected data and a DEM. For the labelling 
and interpretation of spectral classes, the mapped area was 
split into 11 geographic strata, covering the different types of 
climate, landscape and land cover in the region. Finally, the 
labelled clusters were grouped together in 12 land cover 
categories. 


Product description 


The forest map can be downloaded in a single compressed 
file (.zip). No additional information is provided. 


Downloads 
Continental Southeast Asia—Forest Cover Map 


— A raster file containing the LUC information (.tiff) 


Legend and codification 


Code  Label                                           Code  Label 
0     No data                                         7     Evergreen wood and shrubland and regrowth mosaics 
1     Evergreen Mountain forests                      8     Deciduous wood and shrubland and regrowth mosaics 
2     Evergreen Lowland forests                       9     Mosaics of Cropping and Regrowth 
3     Fragmented and degraded evergreen forest cover  10    Other lands 
4     Deciduous forests                               11    Other lands 
5     Mangrove forests                                12    Rocks 
6     Swamp forests and inundated shrubland           13    Water bodies/Sea 


Practical considerations 


Although a technical report describing the characteristics of 
the dataset was published, it is not currently available. The 
available information is therefore limited. In addition, the 
spatial resolution of the map (1 km) limits its capacity to 
map gradual local transitions in tree canopies, such as the 
degradation or fragmentation of forest canopies. 
3 Congo Basin Monitoring Maps 

Product: LULC thematic 
Dates: 1990/00 
Formats: Raster 
Pixel size: 57 m 
Theme: Forest extent (2000) 
    Forest probability (2000) 
    Forest cover clearing (1990-2000) 
Extent: Congo River Basin 
Updating: Not expected 
Change detection: Information on forest cover clearing for the period 1990-2000 
Overall accuracy: Not specified 
Website of reference: https://glad.umd.edu/congo-basin-monitoring 
Website language: English 
Download site: https://glad.umd.edu/congo-basin-monitoring 
Availability: Open Access 
Format(s): .tiff, .img 
Technical documentation: Hansen et al. (2008) 
Other references of interest: Lindquist et al. (2008) 


Project 


Maps of the Congo Basin Monitoring project were devel- 
oped within the context of the Central African Regional 
Program for the Environment (CARPE), funded by the 
United States Agency for International Development 
(USAID). The program aims to promote sustainable resource 
management in the Congo Basin region, for which the 
provision of accurate monitoring data is vital. 

The resulting LUC maps provide a useful resource for 
monitoring humid tropical deforestation at high spatial res- 
olutions. Previous LUC datasets mapping humid tropical 
regions had insufficient spatial resolution. Central Africa 
forest covers are not subject to large-scale clearings and 
instead suffer smaller clearing processes taking place at a 
local level. This means that monitoring projects at coarse 
resolution miss many of the key landscape dynamics. Pre- 
vious attempts to map the humid forests of Central Africa 
also faced important methodological limitations because of 
the lack of cloud-free imagery for the area. The Congo Basin 
Monitoring project aimed to overcome these limitations. 

Two maps were produced for the Congo Basin as part of 
this project: a forest mask and a forest probability map that 
also offers information on forest clearing for the period 
1990-2000. Forest clearing is defined as complete removal 
of the forest overstory. 


Production method 


A forest mask was first created from a percent tree 
cover layer at 250 m, generated by classifying 
MODIS imagery (2000-2004) using the Vegetation 
Continuous Field (VCF) method. 34 metrics were extracted 
from the MODIS imagery to carry out the classification. 
A threshold of 60% was applied to this layer to generate the 
forest mask: all pixels with a forest percentage of over 60% 
were considered forest. All the remaining pixels were 
considered non-forest. Two other categories were also classified 
from MODIS imagery based on a classification tree 
algorithm: water and rural complex. Water pixels were treated as 
non-land in the forest mask, and rural complex pixels were 
considered non-forest. 
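
The thresholding rule described above can be illustrated with a minimal Python sketch. This is not the code used by the producers of the map: the function name, the class labels and the precedence given to the water and rural complex classes over the tree cover threshold are our own assumptions, introduced only for illustration.

```python
# Hypothetical labels for the two extra classes obtained from MODIS imagery.
WATER = "water"
RURAL_COMPLEX = "rural complex"


def forest_mask_value(tree_cover_percent, modis_class=None, threshold=60):
    """Assign one 250 m pixel to 'forest', 'non-forest' or 'non-land'.

    tree_cover_percent: percent tree cover of the pixel (0-100).
    modis_class: optional extra class (WATER or RURAL_COMPLEX).
    threshold: tree cover percentage above which a pixel counts as forest.
    """
    # Assumption: the extra classes take precedence over the threshold.
    if modis_class == WATER:
        return "non-land"
    if modis_class == RURAL_COMPLEX:
        return "non-forest"
    # Pixels with a forest percentage of over 60% are considered forest.
    return "forest" if tree_cover_percent > threshold else "non-forest"


if __name__ == "__main__":
    for pixel in [(75, None), (60, None), (90, WATER), (80, RURAL_COMPLEX)]:
        print(pixel, "->", forest_mask_value(*pixel))
```

Note that a pixel with exactly 60% tree cover is left as non-forest here, following the "over 60%" wording of the text.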

A forest probability layer was obtained from the 
classification of Landsat imagery at the scene level for two 
different epochs: pre-1996 (1986-1996) and post-1996 
(≥1996-2003). The classification was performed on the basis 
of tree models using the previously obtained forest mask as 
the dependent variable and the Landsat imagery as the 
independent variable. Forest cover changes between the two 
periods were mapped through a multi-date direct 
classification of change methodology, using training data at the same 
locations for the two available epochs. 


Product description 


The forest map can be downloaded as a single compressed 
file (.zip) in .tiff format. The forest probability layer is 
available in two different formats (.tif and .img). In both 
cases, the download includes the raster file with the LUC 
information and a text file with a technical description of 
each dataset. 


Downloads 

Forest probability and forest cover clearing 

— Raster file with information on forest probability and forest cover clearing (.tiff) 
— A text file with a technical description of the dataset (.txt) 

MODIS-based evergreen tropical forest map (forest mask) 

— Raster file with information on the forest extent (.tiff) 
— A text file with a technical description of the dataset (.txt) 


Legend and codification 


Forest probability and forest cover clearing 

Code      Label 
0-100     Forest probability (0-100%) 
250       Water 
253       Forest clearing between 1990s and 2000s 
254, 255  No data 

MODIS-based evergreen tropical forest map (forest mask) 

Code  Label       Code  Label 
0     Non forest  1     Forest 
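
Because the forest probability layer mixes a continuous range (0-100) with special codes, it can help to decode the pixel values before computing any statistics. The following Python sketch shows one possible way of doing this with the codes listed in the legend above. The function names are hypothetical, and treating any value outside the legend as an error is our own assumption.

```python
from collections import Counter


def decode_forest_probability(code):
    """Translate one pixel code of the forest probability layer into a label."""
    if 0 <= code <= 100:
        return "forest probability"
    if code == 250:
        return "water"
    if code == 253:
        return "forest clearing"
    if code in (254, 255):
        return "no data"
    # Assumption: values that are not in the legend are rejected.
    raise ValueError(f"code {code} is not in the legend")


def summarise(codes):
    """Count how many pixels fall under each legend label."""
    return Counter(decode_forest_probability(code) for code in codes)


if __name__ == "__main__":
    sample = [0, 55, 100, 250, 253, 254, 255]
    print(dict(summarise(sample)))
```

In practice, the pixel values would be read from the downloaded raster with a GIS library. Here a plain list of integers is used to keep the example self-contained.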
4 MARS Crop Mask Over Africa 

Product: LULC thematic 
Dates: One-date (varies from one product to the next) 
Formats: Raster 
Pixel size: 250 m 
Theme: Cropland extent 
Extent: Africa 
Updating: Not expected 
Change detection: No (only one date) 
Overall accuracy: Expected to be >70% for most of the mapped countries 
Website of reference: https://ec.europa.eu/jrc/en/mars 
Website language: English 
Download site: Not available 
Availability: On request to authors 
Format(s): .tiff 
Technical documentation: Vancutsem et al. (2013) 
Other references of interest: Pérez-Hoyos et al. (2017a, b) 

D. Garcia-Alvarez et al. 

Project 


The Monitoring Agricultural Resources (MARS) unit of the 
Joint Research Centre (JRC) produced a cropland mask for 
Africa to assist the unit's and the Commission's activities in 
crop and food security monitoring. The mask aimed to 
provide the most accurate information possible on cropland 
covers for Africa by merging the best available LUC 
cropland data sources. 

The methodology applied in the production of this dataset 
has also been used in the development of other cropland 
masks (ASAP Land Cover Masks) by the same team. 


Production method 


The MARS crop mask was obtained by merging the best 
available LUC data sources on cropland covers. To this end, 
all the input data sources were resampled or rasterized to a 
common spatial resolution (250 m) and projected with the 
same parameters. Cropland categories were extracted from 
each input dataset. LUC categories were considered as 
cropland when at least 50% of their surface was covered by 
cropland. LUC categories with a cropland proportion of 
between 20 and 50% were manually checked by experts, 
who decided whether to include them as cropland categories 
at a global level or for just one specific region. 
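
The rule used to extract cropland categories can be summarized in a few lines of Python. This is only a sketch of the thresholds stated above, not the producers' implementation: the function name is hypothetical, and we assume that categories with less than 20% of cropland are simply discarded and that the 20% boundary itself falls in the expert-review range, neither of which is stated explicitly in the text.

```python
def cropland_decision(cropland_share_percent):
    """Classify a LUC category by the share of its surface covered by cropland.

    Returns 'cropland', 'expert review' or 'not cropland'.
    """
    if not 0 <= cropland_share_percent <= 100:
        raise ValueError("the cropland share must be between 0 and 100")
    # At least 50% of the surface covered by cropland: cropland category.
    if cropland_share_percent >= 50:
        return "cropland"
    # Between 20 and 50%: manually checked by experts.
    if cropland_share_percent >= 20:
        return "expert review"
    # Assumption: below 20%, the category is not treated as cropland.
    return "not cropland"


if __name__ == "__main__":
    for share in (80, 50, 35, 20, 5):
        print(share, "->", cropland_decision(share))
```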

The accuracy of each dataset was assessed against Google 
Earth imagery. When several datasets were available for the 
same area, the most accurate one was selected. If several 
datasets had similar levels of accuracy, the most detailed or 
recent was selected. 
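
The selection of one data source per area can also be sketched in Python. The example below is an illustration under several assumptions of our own: the `Candidate` fields, the `tolerance` used to decide when two accuracies are "similar", and the ordering of the tie-break (finer resolution first, then the more recent year) are hypothetical, since the text only states that the most detailed or recent dataset was chosen.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical description of one input dataset for a given area."""
    name: str
    accuracy: float       # percent agreement with the reference imagery
    resolution_m: float   # pixel size in metres (smaller = more detailed)
    year: int             # reference year of the dataset


def select_source(candidates, tolerance=2.0):
    """Pick the most accurate dataset, breaking near-ties by detail and date."""
    best_accuracy = max(c.accuracy for c in candidates)
    # Assumption: accuracies within `tolerance` points count as similar.
    similar = [c for c in candidates if best_accuracy - c.accuracy <= tolerance]
    # Assumption: prefer the finest resolution, then the most recent year.
    return min(similar, key=lambda c: (c.resolution_m, -c.year))


if __name__ == "__main__":
    area_candidates = [
        Candidate("source_a", accuracy=80.0, resolution_m=300, year=2005),
        Candidate("source_b", accuracy=79.0, resolution_m=30, year=2000),
        Candidate("source_c", accuracy=60.0, resolution_m=10, year=2010),
    ]
    print(select_source(area_candidates).name)
```

With these invented values, `source_c` is discarded because its accuracy is clearly lower, and `source_b` is preferred over `source_a` because the two have similar accuracy and `source_b` is more detailed.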

The input datasets were Globcover, SADC, Cropland Use 
Intensity datasets from USGS, Woody Biomass map of 
Ethiopia, AFRICOVER, JRC-MARS crop masks, LULC 
2000 USGS datasets and national land cover maps of the 
Democratic Republic of Congo, Mozambique and Senegal. 


Product description 


The crop mask is available in Google Drive on request to the 
producers of the map. The download includes a document 
with a technical description of the product as well as the 
raster file with the LUC information. Another raster file is 
provided with information about the data source that was 
finally selected to create the crop mask in each case. 


Downloads 


MARS crop mask over Africa 


— Raster file with crop extent (.tiff) 

— Raster file with information on the data source used to map each 
area (.tiff) 

— Document with a technical description of the dataset (.doc) 


Legend and codification 


Code  Label     Code  Label 
0     Cropland  1     No cropland 


Practical considerations 


Users interested in accessing the dataset should apply to the 
map’s authors (Christelle.vancutsem@ec.europa.eu). Because 
the map was obtained by merging data from selected data 
sources, each with its own reference year, the dataset cannot 
provide LUC information for any single reference year. 
5 HRL—High Resolution Layers 

Product: LULC thematic 
Dates: 2006, 2009, 2012, 2015, 2018 (Imperviousness) 
    2012, 2015, 2018 (Forests) 
    2015, 2018 (Grassland, Wetness and Water) 
    2015 (Small Woody Features) 
Formats: Raster 
Pixel size: 5 m (Small Woody Features) 
    10 m (Products since 2018) 
    20 m (Products up to 2015) 
    100 m (Mosaics) 
Themes: Extent and percentage of impervious areas 
    Percentage of tree cover areas, leaf type and forest type 
    Extent of grassland areas 
    Wetness and water covers (5 water/wet classes out of 8) 
    Extent of Small Woody Features 
Extent: Europe (39 countries) 
Updating: Planned every 3 years 
Change detection: Through change layers 
Overall accuracy: Imperviousness HRL, Forests HRL: expected to be >90% 
    Grassland HRL, Wetness and Water HRL: expected to be >80% 
Website of reference: https://land.copernicus.eu/pan-european/high-resolution-layers 
Website language: English, German and French 
Download site: https://land.copernicus.eu/pan-european/high-resolution-layers 
Availability: Open Access after registration 
Format(s): .tiff 
Technical documentation: Copernicus Land Monitoring Service (2020a, b, c, d), D'amico et al. (2019), Faucqueur et al. (2018), Langangke (2015, 2016), Langangke et al. 
    (2017, 2018a, b, 2019), Pennec et al. (2019a, b), Smith et al. (2019), Weirather et al. (2019a, b) 
Other references of interest: Büttner et al. (2016), Manakos et al. (2018), Sannier et al. (2017) 


Project 


The High-Resolution Layers are produced within the 
framework of the Copernicus Land Monitoring Programme. 
They were created as a means of overcoming some of the 
limitations associated with CORINE Land Cover (CLC), 
such as lack of detail, the presence of mixed classes and the 
difficulty of adapting the CLC legend to other common 
classification schemes, such as the FAO LCCS. Each 
High-Resolution Layer is associated with one of the CLC 
Level 1 classes: artificial surfaces (Imperviousness HRL), 
agricultural areas (Grassland HRL), forest and semi-natural 
areas (Forests HRL), wetlands and water bodies (Water & 
Wetness HRL). 

The different High-Resolution Layers are separately 
produced using specific methods. Since 2018, they have 
been produced at enhanced spatial resolution (10 m) based 
on Sentinel imagery. This marks a change in the method- 
ology applied in the production of HRL compared to the 
layers created for previous years of reference. 

Some of the HRL layers have been produced for more 
years than the others, such as the Imperviousness HRL, 
available since 2006, and the Forests HRL, available since 
2012. However, when available, the reference years are 
almost all the same for all the layers. The only exception is 
the recently created Small Woody Features HRL. In some 
cases, when more than one date is available, change layers 
have been developed. 


Production method 


Each HRL has its own specific production method, as each 
theme is characterized in a different way. Nevertheless, all 
the HRLs are obtained by automatic classification and 
interactive rule-based classification of high-resolution 
imagery, mostly from the Sentinel constellation. The 
Imperviousness HRL and Water and Wetness HRL are obtained 
from both optical and radar data, while the Forests, 
Grasslands and Small Woody Features HRLs are obtained 
exclusively from optical data. 

Change layers are obtained by comparing the status layers 
for two different years of reference. For the changes between 
2018 and the previous year of reference, some uncertainties 
may arise because of the change in the spatial resolution: 
10 m vs 20 m. The production teams have implemented 
various different measures to prevent such uncertainties, 
including the development of supporting layers that inform 
about the changes that take place due to technical reasons 
and the level of confidence of the obtained change layer. 

Initial production of the HRL is centralized. Then, each 
country reviews and verifies the results, so enhancing this 
initial product. For more detailed information about the 
production process of all the HRLs, readers are referred to 
the technical documentation cited above. 


Product description 
Imperviousness HRL 


The Imperviousness HRL can be separately downloaded 
for each year of reference or for each period of changes. In 
the latter case, users can choose between an uncategorized 
file showing the change in the degree of imperviousness 
and a file that categorizes this change in a series of classes. 
For the reference year 2018, users can also download the 
Impervious built-up layer as a separate file. This is a 
binary map differentiating built-up areas from 
non-built-up areas. 

The layers are disseminated at country level in 
100 x 100 km tiles. Users download a single file with all 
the tiles covering the selected country. A mosaic of all the 
mapped countries is also available as a single file at two 
spatial resolutions: 10-20 m (the original resolution) and 
100 m. 

Different supporting layers are also available for down- 
load as part of the Imperviousness HRL. Unlike the previous 
layers, they are available in the “Expert Products” section as 
single files covering all of Europe. These supporting layers 
include (i) a layer indicating the change in the degree of 
imperviousness between 2015 and 2018 due to technical 
reasons (IMCS); (ii) a layer showing the confidence level of 
the Imperviousness density 2018 layer at 10 m (IMDCL); 
and (iii) an adaptation of the Imperviousness density 2015 
layer to a spatial resolution of 10 m, to enable researchers to 
study changes in the impervious area between 2015 and 
2018 (IMDR). 

All downloads have the same contents: a raster file con- 
taining the LUC information, a file to symbolize it in any 
GIS software and a metadata file. Files for the pre-2018 
editions of Imperviousness HRL also include an Excel file 
with technical information about the product. 


Downloads 


Imperviousness built-up 2018 (Status Map) 

Imperviousness density 2018 (Status Map) 

Imperviousness Change 2015-2018 (Change Map) 
Imperviousness Classified Change 2015-2018 (Change Map) 


— Raster file with LUC information (.tiff) (DATA folder) 
— Text file to symbolize the raster in QGIS (.txt) (Symbology folder) 
— Metadata file (Metadata folder) 
Legend and codification 

Imperviousness built-up (Status Map) 

Code  Label         Code  Label 
0     Non built-up  255   Outside area 
1     Built-up 

Imperviousness density (Status Map) 

Code   Label                         Code  Label 
0      Non-impervious areas          254   Unclassifiable 
1-100  Degree of imperviousness (%)  255   Outside area 

Imperviousness Change (Change Map) 

Code     Label 
0-99     Percentage of decreased imperviousness density 
100      Unchanged areas with some degree of imperviousness 
101-200  Percentage of increased imperviousness density 
201      Unchanged areas with no degree of imperviousness 
254      Unclassifiable (no satellite image available, or clouds, shadows, or snow) 

Imperviousness Classified Change (Change Map) 

Code  Label 
0     Unchanged areas with Imperviousness Density = 0% 
1     New cover (increasing imperviousness density, which was 0% at first reference date) 
2     Loss of cover (decreasing imperviousness density, which was 0% at second reference date) 
10    Unchanged areas with Imperviousness Density > 0% at both reference dates 
11    Increased Imperviousness Density (>0% at both reference dates) 
12    Decreased Imperviousness Density (>0% at both reference dates) 
254   Unclassifiable 
255   Outside area 
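
To make the classified change legend above easier to follow, the Python sketch below derives a classified change code from the imperviousness density of one pixel at two reference dates. It only reproduces the logic of the legend: the actual production chain includes additional processing that is not modelled here, and the function name and the way in which the unclassifiable (254) and outside area (255) codes are propagated are our own assumptions.

```python
UNCLASSIFIABLE = 254
OUTSIDE_AREA = 255


def classified_change(density_first, density_second):
    """Return the classified change code for one pixel.

    Both inputs use the codes of the density status layer:
    0-100 (degree of imperviousness), 254 or 255.
    """
    pair = (density_first, density_second)
    # Assumption: special codes in either date propagate to the result.
    if OUTSIDE_AREA in pair:
        return OUTSIDE_AREA
    if UNCLASSIFIABLE in pair:
        return UNCLASSIFIABLE
    if not all(0 <= density <= 100 for density in pair):
        raise ValueError("density codes must be 0-100, 254 or 255")
    if density_first == 0 and density_second == 0:
        return 0   # unchanged, density = 0%
    if density_first == 0:
        return 1   # new cover
    if density_second == 0:
        return 2   # loss of cover
    if density_second > density_first:
        return 11  # increased density, >0% at both dates
    if density_second < density_first:
        return 12  # decreased density, >0% at both dates
    return 10      # unchanged, >0% at both dates


if __name__ == "__main__":
    for first, second in [(0, 0), (0, 30), (30, 0), (30, 30), (30, 50), (50, 30)]:
        print(first, second, "->", classified_change(first, second))
```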


Forests HRL 


For each available year of reference, three different types of 
layer can be downloaded as part of the Forests HRL: (i) a 
layer showing the forest density or the degree of tree cover 
(Tree Cover Density); (ii) a layer informing about the 
dominant leaf type, distinguishing mainly between broadleaf 
and coniferous trees (Dominant Leaf Type); and (iii) a layer 
informing about the dominant leaf type in treed areas 
covering more than 0.5 ha and with a tree cover density of over 
10%, i.e. those areas considered as forest according to the 
FAO definition (Forest Type). 
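
The two criteria mentioned above (area and tree cover density) can be expressed as a simple check. The following Python sketch is an illustration only: the function names are hypothetical, and the real Forest Type layer is derived through a more elaborate workflow described in the technical documentation.

```python
def pixels_to_hectares(n_pixels, pixel_size_m):
    """Convert a number of square pixels into hectares (1 ha = 10,000 m2)."""
    return n_pixels * pixel_size_m ** 2 / 10_000


def meets_forest_criteria(patch_area_ha, tree_cover_density_percent):
    """Check the criteria quoted in the text: > 0.5 ha and > 10% tree cover."""
    return patch_area_ha > 0.5 and tree_cover_density_percent > 10


if __name__ == "__main__":
    # A patch of 51 pixels at 10 m resolution covers 0.51 ha.
    area = pixels_to_hectares(51, 10)
    print(area, meets_forest_criteria(area, 35))
```

At 10 m resolution each pixel covers 0.01 ha, so a treed patch needs more than 50 connected pixels to exceed the 0.5 ha threshold.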

Change layers for Tree Cover and Dominant Leaf Type 
are also provided for each mapped period. A layer of tree 
cover density changes was initially created for the period 
2012-2015. However, it has not been updated for the new 
mapping periods and is no longer distributed. 

In all cases, the layers are distributed at a country level in 
100 x 100 km tiles. A single file mosaic of each layer for all 
the mapped countries is also available at two spatial reso- 
lutions: 10-20 m (the original resolution) and 100 m. 

Nine additional layers were also produced as supple- 
mentary information to the Forests HRL for the year 2018. 
These can be downloaded from the "Experts products" 
section. They provide information about the broadleaved and 
coniferous cover densities at 100 m (BCD, CCD) as well as 
other relevant technical information about the production of 
the Forests HRL: level of confidence, data sources, etc. The 
technical documentation of HRL Forests includes a detailed 
description of each of these supporting layers. 

In all cases, the downloaded files include the raster with 
LUC information, a file to symbolize it in any GIS software 
and the product's metadata. Files for the pre-2018 editions of 
Forests HRL also include an Excel file with technical 
information about the product. 


Downloads 


Tree Cover Density 2018 

Tree Cover Change Mask 2015-2018 
Dominant Leaf Type 2015 

Dominant Leaf Type Change 2015-2018 
Forest Type 2018 


— Raster file with LUC information (.tiff) (DATA folder) 
— Text file to symbolize the raster in QGIS (.txt) (Symbology folder) 
— Metadata file (Metadata folder) 


Legend and codification 


Tree Cover Density 

Code   Label                             Code  Label 
0      Non-tree-covered areas            254   Unclassifiable 
1-100  Percentage of tree cover density  255   Outside area 

Tree Cover Change Mask 

Code  Label                                Code  Label 
0     Unchanged areas with no tree cover   10    Unchanged areas with tree cover 
1     New tree cover                       254   Unclassifiable in any of parent status layers 
2     Loss of tree cover                   255   Outside area 

Dominant Leaf Type 

Code  Label                   Code  Label 
0     Non-tree-covered areas  254   Unclassifiable 
1     Broadleaved trees       255   Outside areas 
2     Coniferous trees 

Dominant Leaf Type Change 

Code  Label                                Code  Label 
0     Unchanged areas with no tree cover   10    Unchanged areas with tree cover 
1     New broadleaved cover                12    Potential change among dominant leaf types 
2     New coniferous cover                 254   Unclassifiable in any of parent status layers 
3     Loss of broadleaved cover            255   Outside area 
4     Loss of coniferous cover 

Forest Type 

Code  Label                   Code  Label 
0     Non-tree-covered areas  3     Mixed forest (only for aggregated 100 m layer) 
1     Broadleaved trees       254   Unclassifiable 
2     Coniferous trees        255   Outside areas 

Grassland HRL 


A status layer for each reference year and a layer of changes 
for each mapped period can be downloaded separately as 
part of the Grassland HRL. Moreover, three additional 
supporting layers are distributed as “Expert products”: (i) a 
layer showing the probability of each pixel being grassland 
(Grassland Vegetation Probability Index, GRAVPI); (ii) a 
layer informing about the number of years since the last 
ploughing (Ploughing Indicator, PLOUGH); and (iii) a 
confidence layer for the Grassland 2018 status map (GRACL). 
The status layer and the change layers are distributed at 
country level in 100 x 100 km tiles. A single-file European 
mosaic is also available at two spatial resolutions: 10-20 m 
(the original resolution) and 100 m. The three supporting 
layers can be downloaded as single files covering the whole 
of the mapped area. 

All downloads include the raster with LUC information, a 
file to symbolize it in GIS and a metadata file. Downloads 
for the pre-2018 editions of the layers also include an Excel 
file with technical information about the product. 


Downloads 


Grassland 2018 (Status Map) 
Grassland Change 2015-2018 (Change maps) 
— Raster file with LUC information (.tiff) (DATA folder) 


— Text file to symbolize the raster in QGIS (.txt) (Symbology folder) 
— Metadata file (Metadata folder) 


Legend and codification 


Grassland (Status Map) 


Code | Label | Code | Label
0 | Non-grass areas | 254 | Unclassifiable
1 | Grassy and non-woody vegetation | 255 | Outside area


Grassland Change (Change maps) 


Code | Label | Code | Label
0 | All non-grassland areas | 11 | Unverified grassland gain
1 | Grassland gain | 22 | Unverified grassland loss
2 | Grassland loss | 254 | Unclassifiable in any of parent status layers
10 | Unchanged grassland in both years | 255 | Outside area
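The change legend separates verified from unverified gains and losses, so any summary of grassland change depends on whether codes 11 and 22 are counted. The sketch below, which assumes the change codes have been read into a plain Python sequence, makes that choice explicit through a parameter; the function name is ours and the decision to include unverified pixels rests with the analyst.

```python
from collections import Counter

# Codes from the Grassland Change legend.
GAIN, LOSS = 1, 2
UNVERIFIED_GAIN, UNVERIFIED_LOSS = 11, 22


def net_grassland_change(pixels, include_unverified=False):
    """Return (gain, loss, net) pixel counts for a sequence of change codes."""
    counts = Counter(pixels)
    gain = counts[GAIN]
    loss = counts[LOSS]
    if include_unverified:
        # Optionally add the pixels flagged as unverified change.
        gain += counts[UNVERIFIED_GAIN]
        loss += counts[UNVERIFIED_LOSS]
    return gain, loss, gain - loss


sample = [0, 1, 1, 2, 10, 11, 22, 22, 254, 255]
print(net_grassland_change(sample))
print(net_grassland_change(sample, include_unverified=True))
```

Comparing both results gives a simple range for the net change rather than a single figure.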


Water and Wetness HRL 


The Water and Wetness HRL is made up of a main pro- 
duct mapping the different types of water and wetness 
covers in Europe. Users can also download an additional 
layer (Expert products) showing the probability of each 
pixel being water or wetness. Two extra technical layers 
are also available as expert products: one informs about 
the confidence of the 2018 status map (WACL) while the 
other studies the differences in the mapping of water and 
wetness covers between 2015 and 2018 (WAWCSL). 
Different files can be downloaded for each available layer 
and year. The main layer is distributed at country level in 
100 x 100 km tiles. However, a single-file mosaic is also 
available at the original resolution of the product (10-20 m) 
and at 100 m. The supporting layers are only available at the 
original resolution as single files covering the whole of 
Europe. 

All downloads include the raster with LUC information, a 
file to symbolize it in any GIS software and a metadata file. 
The available layer for 2015 also includes an Excel file with 
technical information about the product. 


Downloads 

Water and Wetness 2018-WAW (Status Map) 

— Raster file with LUC information (.tif) (DATA folder) 

— Text file to symbolize the raster in QGIS (.txt) (Symbology folder) 
— Metadata file (Metadata folder) 


Legend and codification 


Water and Wetness (Status Map) 


Code | Label | Code | Label
0 | Dry | 4 | Temporary wet
1 | Permanent water | 253 | Sea water
2 | Temporary water | 254 | Unclassifiable
3 | Permanent wet | 255 | Outside areas


Small Woody Features HRL 


The Small Woody Features HRL is available in either vector 
or raster files. Vector files can be downloaded in two dif- 
ferent formats: ESRI Geodatabase and GeoPackage. Raster 
files can be downloaded at two different spatial resolutions: 5 
and 100 m. 
The vector and raster files at 5 m are distributed in tiles 
obtained after splitting each European country into a series 
of large regions. To find out which tile corresponds to their 
particular area of interest, users should consult the viewer on 
the dataset’s website.¹ The rasters at 100 m are distributed as 
single files covering the whole of Europe, without splits into 
regions. 

The raster at 5 m only differentiates between Small 
Woody Features (SWF) and Additional Woody Features 
(AWF). The vector file also differentiates between SWF and 
AWF, although it splits the first category into linear and 
patchy structures. Three different layers are available at 
100 m: (i) the density of Small Woody Features (SWF); 
(ii) the density of Additional Woody Features (AWF); and 
(iii) the density of both small and additional woody features 
(SWFAWF). 
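The relation between the 5 m raster and the 100 m density layers can be illustrated with a block aggregation: each 100 m cell covers 20 x 20 pixels of 5 m. The sketch below is our own illustration, not the producer's method. It assumes that density is the percentage of 5 m pixels carrying a given code, that code 1 stands for SWF and code 3 for AWF, and that unclassifiable and outside pixels simply count as non-hits.

```python
def block_density(grid, target_codes, block=20):
    """Percentage of pixels with a target code in each block x block window.

    grid: list of equal-length rows of integer codes (5 m pixels).
    block: window size in pixels; 20 x 5 m = 100 m.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r0 in range(0, rows, block):
        out_row = []
        for c0 in range(0, cols, block):
            # Collect the pixels of this window, clipping at the grid edge.
            window = [
                grid[r][c]
                for r in range(r0, min(r0 + block, rows))
                for c in range(c0, min(c0 + block, cols))
            ]
            hits = sum(1 for v in window if v in target_codes)
            out_row.append(round(100 * hits / len(window)))
        out.append(out_row)
    return out


# Tiny example with 2 x 2 windows instead of 20 x 20.
grid = [
    [1, 1, 0, 0],
    [1, 0, 0, 3],
    [0, 0, 3, 3],
    [0, 0, 3, 3],
]
print(block_density(grid, {1}, block=2))     # SWF only
print(block_density(grid, {3}, block=2))     # AWF only
print(block_density(grid, {1, 3}, block=2))  # SWF and AWF together
```

Passing a set of codes lets the same function reproduce the three kinds of density layer described above.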


Downloads 


Small Woody Features 2018 (Geodatabase) 


— Vector file with LUC information (DATA folder) 

— Raster file with information about the accuracy of the product (.tiff) 

— PDF with a guide about how to use the ESRI Geodatabase in QGIS (Documents folder) 

— PDF with information about the product (Documents folder) 

— Metadata about the product (Metadata folder) 


Small Woody Features 2018 (Raster 5 m) 


— Raster file with LUC information (.tiff) (Data folder) 

— File to symbolize the raster in GIS (.clr) (Data folder) 

— PDF with information about the product (Documents folder) 
— Metadata about the product (Metadata folder) 


SWF density (Raster 100 m) 
AWF density (Raster 100 m) 
SWF + AWF density (Raster 100 m) 


— Raster file with LUC information (.tiff) (Documents folder) 
— PDF with information about the product (Documents folder) 
— Metadata about the product (Metadata folder) 


1 https://land.copernicus.eu/pan-european/high-resolution-layers/small-woody-features/small-woody-features-2015. 


Database 
Small Woody Features 2018 (Geodatabase) 
gid | code | area | class_name
1 | 1 | 5301,36442179704 | Linear structures of trees, hedges, bushes and scrub
2 | 1 | 5376,72678401566 | Linear structures of trees, hedges, bushes and scrub
3 | 1 | 3579,55745263859 | Linear structures of trees, hedges, bushes and scrub


— Gid: Unique identifier for each polygon 

— Code: Thematic code for each polygon 

— Area: Area of the polygon, in square meters 

— Class name: Category assigned to each polygon 
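The area values in the sample rows use a decimal comma, which many tools will not parse as a number without conversion. The following sketch shows one way to convert them and to total the area per thematic code. It reads the three sample rows as gid 1 to 3, all with code 1, which is our interpretation of the table; the function names are hypothetical.

```python
from collections import defaultdict


def parse_area(text):
    """Convert an area written with a decimal comma (e.g. '5301,36') to float."""
    return float(text.replace(",", "."))


def area_by_code(records):
    """Sum polygon areas (square metres) per thematic code.

    records: iterable of (gid, code, area_text) tuples.
    """
    totals = defaultdict(float)
    for _gid, code, area_text in records:
        totals[code] += parse_area(area_text)
    return dict(totals)


records = [
    (1, 1, "5301,36442179704"),
    (2, 1, "5376,72678401566"),
    (3, 1, "3579,55745263859"),
]
totals = area_by_code(records)
# Report the total for code 1 in hectares (1 ha = 10,000 square metres).
print(round(totals[1] / 10_000, 2), "ha")
```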


Legend and codification 


Small Woody Features (Geodatabase and GeoPackage) 


Code | Label | Code | Label
1 | Linear structures of trees, hedges, bushes and scrub | 3 | Additional woody features
2 | Patchy structures of trees, hedges, bushes and scrub | |


Small Woody Features (Raster 5 m) 


Code | Label | Code | Label
0 | Non-SWF area | 254 | Unclassifiable
1 | SWF area (Linear or patchy structures of trees, hedges, bushes and scrub) | 255 | Outside areas
3 | Additional woody features | |


SWF density (Raster 100 m) 


Code | Label | Code | Label
0 | Non-SWF area | 254 | Unclassifiable
0-100 | Small Woody Features density | 255 | Outside areas


AWF density (Raster 100 m) 


Code | Label | Code | Label
0 | Non-SWF area | 254 | Unclassifiable
0-100 | Additional Woody Features density | 255 | Outside areas


SWF + AWF density (Raster 100 m) 


Code | Label | Code | Label
0 | Non-SWF area | 254 | Unclassifiable
0-100 | Small + additional Woody Features density | 255 | Outside areas


Practical considerations 


Users can consult the layers via the online viewers available 
at the product’s download website. The technical documents 
provide useful descriptions of the characteristics of the 
products and all the layers available for each year of 
reference, including the expert products, which we have not 
reviewed in detail. 
6 ESM—European Settlement Map 


Product | LULC Thematic
Dates | 2012, 2015
Formats | Raster
Pixel size | 2 m, 10 m (2015); 2.5 m, 10 m, 100 m (2012)
Theme | Extent of Built-up areas (2015); Extent of Residential areas (2012); 13 built-up categories (2012); Percentage of built-up areas (2012)
Extent | Europe
Updating | Not planned
Change detection | No
Overall accuracy | Expected to be > 80% (ESM 2015 - 2 m); Expected to be > 70% (ESM 2015 - 10 m)


D. García-Álvarez et al. 

Website of reference 


Website | https://land.copernicus.eu/pan-european/GHSL/european-settlement-map
Language | English
Download site | https://land.copernicus.eu/pan-european/GHSL/european-settlement-map
Availability | Open access after registration
Format(s) | .tiff
Technical documentation | Ferri et al. (2014, 2016a, 2017), Florczyk et al. (2016), Pafi et al. (2016a), Pesaresi et al. (2013), Sabo et al. (2019), Smith and Sannier (2017)
Other references of interest | Ferri et al. (2016b), Pafi et al. (2016b)


Project 


The European Settlement Map (ESM) is part of the Global 
Human Settlement Layer (GHSL) project, supported by the 
European Commission through the Joint Research Centre 
(JRC) and the Directorate General for Regional and Urban 
Policy (DG REGIO). ESM complements the GHSL global 
products by providing an urban settlement map for Europe at 
a very detailed spatial resolution: 2-2.5 m versus 30 m for 
the GHSL. Both products share similar automatic methods 
for extracting LUC information from satellite imagery. 

ESM was initially released in 2014, with successive 
updates in 2016, 2017 and 2019. In 2014, a dataset was 
created for the reference year 2012, showing the percentage 
of the surface area that was built up. This was revised with a 
new production methodology in 2016 and again in 2017. 
The first update improved the accuracy of the product and its 
consistency with population data. The spatial resolution was 
also improved: from 100 to 10 m. The second update 
increased the spatial detail of the product to 2.5 m and its 
thematic detail to 12 classes. A new dataset 
at 2 m for the year 2015 was released in 2019, using a 
different production methodology. Unlike previous editions, 
this map only shows the extent of built-up areas, without 
providing further information about the built-up fraction per 
pixel. 

In addition to the base layer delineating built-up areas, the 
latest edition of the product (2019) includes a classification 
differentiating residential from non-residential areas at a 
spatial resolution of 10 m. 


Production method 


The ESM production method has changed over time, 
although it has always been fully automatic. The latest edi- 
tion (2019) was produced at 2 m on the basis of the 
Copernicus VHR_IMAGE_2015 imagery dataset, made up 
of images captured by the satellites Pleiades, Deimos-02, 
WorldView-2, WorldView-3, GeoEye-01 and Spot 6/7. The 
imagery was classified through a scene-based classification 
algorithm: Symbolic Machine Learning (SML). 

The first three editions of ESM were obtained at 100, 10 
and 2.5 m through a textural and morphological technique of 
unsupervised built-up area detection. Spot 6/7 imagery was 
used as an input. In the third edition (2017), auxiliary data 
sources (OpenStreetMap, Urban Atlas, etc.) were also used to 
provide more thematic detail, distinguishing between 13 
LUC categories, instead of just between built-up and 
non-built-up areas. 
Product description 


The ESM for each of the available editions can be down- 
loaded separately as a single file. If more than one spatial 
resolution is available, users must separately download the 
specific product for the spatial resolution they require. 

The ESM layers at 100 m for the 2014 and 2016 editions 
are distributed as a single European file. For the 2017 edition 
of ESM at 100 m, users must download a different file 
covering the entire mapped area for each of the categories 
(13 in total). The 2016 edition at 10 m is also distributed in 
400 x 400 km tiles. Finally, the ESM layers at 2-2.5 and 
10 m are distributed in 100 x 100 km tiles for the 2017 and 
2019 editions of the product. In all cases, users can find out 
which tile or tiles fall within their area of interest by con- 
sulting the viewer available on the ESM website. 
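When many tiles are involved, a first estimate of the tiles needed can be computed from projected coordinates before checking the viewer. The sketch below assumes a hypothetical regular grid whose tiles are aligned to multiples of the tile size in a metric coordinate system; the real tile origin and naming scheme are not described here and must be confirmed in the ESM viewer.

```python
import math

TILE_SIZE_M = 100_000  # 100 x 100 km tiles, expressed in metres


def tile_index(easting, northing, tile_size=TILE_SIZE_M):
    """Column/row index of the tile containing a projected coordinate (metres)."""
    return math.floor(easting / tile_size), math.floor(northing / tile_size)


def tiles_for_bbox(min_e, min_n, max_e, max_n, tile_size=TILE_SIZE_M):
    """All tile indices intersecting a bounding box given in metres."""
    c0, r0 = tile_index(min_e, min_n, tile_size)
    c1, r1 = tile_index(max_e, max_n, tile_size)
    return [(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


# A 20 x 20 km study area that straddles four tiles of the assumed grid.
print(tiles_for_bbox(3_190_000, 2_090_000, 3_210_000, 2_110_000))
```

Setting `tile_size=400_000` adapts the same sketch to the 400 x 400 km tiles of the 2016 edition at 10 m, under the same alignment assumption.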


Downloads 


Due to the complexity of this product, with different editions 
available for the same years of reference at different spatial 
resolutions, in the following table we present an overview of 
all the available maps, classified according to the year they 
were released, their spatial resolution and the year of refer- 
ence, i.e. the year for which they map the LUC covers. The 
different files available for download are described below the 
table. 


Available products for download 


Product | Edition | Pixel size
ESM 2012 | 2014 | 100 m
ESM 2012 | 2016 | 100 m
ESM 2012 | 2016 | 10 m
ESM 2012 | 2017 | 100 m
ESM 2012 | 2017 | 10 m
ESM 2012 | 2017 | 2.5 m
ESM 2015 | 2019 | 10 m
ESM 2015 | 2019 | 2 m


ESM 2012 (2014)—100 m 


— Raster file with built-up percentage (EU_GHSL100m folder) 
— Raster files with technical information about the product 
(EU_GHSL100m_Data_Mask and 
EU_GHSL100m_Data_Processed_Ref_Year folders) 


ESM 2012 (2016)—100 m, 10 m 


— Raster file with built-up percentage 
— Text file with a description of the product (.txt) 
ESM 2012 (2017)—100 m 


— Raster file with class percentage per pixel for one of the classes 
mapped in 2nd edition of ESM 


ESM 2012 (2017)—10 m 


— Raster files with class percentage per pixel for each of the classes 
mapped in 2nd edition of ESM 


ESM 2012 (2017)—2.5 m 


— Raster file with LC information 
— Layer style file for ArcGIS (.lyr) and QGIS (.qml) 
— PDF with technical information about the product 


ESM 2015 (2019)—10 m, 2 m 


— Raster file with LC information 
— TXT files with map legend and copyright information 
— File for symbolizing the raster in GIS (.clr) 


Legend and codification 


ESM 2012 (2014)—100 m 


Code | Label | Code | Label
0-1 | Built-up percentage (0-100%) | -2 | No data


ESM 2012 (2016)—100 m 


Code | Label
0-1 | Built-up percentage (0-100%)


ESM 2012 (2016)—10 m 


Code | Label
0-100 | Built-up percentage (0-100%)


ESM 2012 (2017)—100 m-Class 50 (Buildings) 


Code | Label
0-1 | Percentage (0-100%) of the selected class (50)


ESM 2012 (2017)—10 m-Class 50 (Buildings) 


Code | Label
0-100 | Percentage (0-100%) of the selected class (50)


ESM 2012 (2017)—2.5 m 


Code | Label | Code | Label
50 | BU Buildings | 20 | NBU Area-Green NDVI
45 | BU Area-Street Green NDVI | 15 | NBU Area-Streets
41 | BU Area-Green UA | 10 | NBU Area-Open Space
40 | BU Area-Green NDVI | 2 | Railways
35 | BU Area-Streets | 1 | Water
30 | BU Area-Open Space | 0 | No Data
25 | NBU Area-Street Green NDVI | |


ESM 2015 (2019)—10 m 


Code | Label | Code | Label
0 | No data | 250 | Non-residential built-up area
1 | Land | 255 | Residential built-up area


ESM 2015 (2019)—2 m 


Code | Label | Code | Label
0 | No data | 2 | Water
1 | Land | 255 | Built-up area
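Since the ESM editions use different codings (values from 0 to 1 at 100 m, from 0 to 100 at 10 m, and categorical codes for ESM 2015), layers need to be brought to a common scale before they are compared. The sketch below shows one possible harmonisation in plain Python. The function names are ours, and treating code 0 as missing data follows the ESM 2015 legends only.

```python
# ESM 2015 codes that denote built-up land, per pixel size (from the legends).
BUILT_UP_CODES = {
    "10m": {250, 255},  # non-residential and residential built-up areas
    "2m": {255},        # built-up area
}
NO_DATA = 0


def built_up_mask(pixels, resolution):
    """Map ESM 2015 codes to 1 (built-up), 0 (not built-up) or None (no data)."""
    codes = BUILT_UP_CODES[resolution]
    return [None if p == NO_DATA else int(p in codes) for p in pixels]


def to_percent(value, max_code):
    """Rescale an ESM 2012 value to 0-100; max_code is 1 or 100 by edition."""
    return 100 * value / max_code


print(built_up_mask([0, 1, 250, 255], "10m"))
print(built_up_mask([0, 1, 2, 255], "2m"))
print(to_percent(0.5, 1), to_percent(50, 100))
```

Keeping missing data as `None` rather than 0 avoids counting unmapped pixels as non-built-up land.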


Practical considerations 


All editions of ESM are available for download at the 
Copernicus Land programme website.² The ESM 2015 can 
be consulted through an online viewer as part of the GHSL 
framework.³ It can also be downloaded from the same 
website in tiles.⁴ 

The 2016 ESM edition at 10 m is distributed in 237 
400 x 400 km tiles. However, of the 237 tiles available for 
download, only 86 fall within areas with impervious sur- 
faces. Therefore, only 86 out of the 237 tiles include LUC 
information. 


2 https://land.copernicus.eu/pan-european/GHSL/european-settlement-map. 

3 https://ghsl.jrc.ec.europa.eu/ESM_Visualisation.php. 

4 https://ghsl.jrc.ec.europa.eu/download.php?ds=ESM. 

